Chairman: Dr. D. W. Hearn

Major  Department:  Industrial  and  Systems  Engineering 

The generalized inverse dual (GID) and its properties are developed
for convex quadratically constrained quadratic programs, based on the
Moore-Penrose generalized inverse of a matrix. The resulting dual problem
has a concave objective function and the constraints are shown to be
effectively linear. The closure of the dual constraint set is characterized
by an index maximal dual vector.

It is also shown that the GID is equivalent to the conjugate function
dual (CFD), developed through conjugate function theory or generalized
geometric inequalities, and has significantly fewer variables.

Comparison of the two duals, GID and CFD, for a projected gradient
algorithm demonstrates that for strictly convex problems the GID is
favored over the CFD; some experimental results for the GID are given.
For nonstrictly convex problems some advantages of the GID are discussed,
e.g., determining an initial dual feasible vector.

Five areas of application where the GID could prove to have significant
computational advantages are discussed: the multifacility Euclidean
distance location problem, stochastic programming, the general Fermat
problem, portfolio selection, and general convex programming.
CHAPTER 1
INTRODUCTION 

This dissertation is concerned with convex quadratically constrained
quadratic programs.

Various examples and references to such programs have occurred in
the mathematical programming literature. John [37], in his classic
paper of 1948, presented as an example a quadratically constrained program
with a linear objective function. The special structure of quadratically
constrained quadratic programs was also recognized by Kuhn and Tucker [40]
and used as an example saddle point problem.

Other examples of quadratically constrained quadratic programs are
found in the field of chance-constrained programming, see Charnes and
Cooper [11], and in Euclidean distance location problems, see Elzinga and
Hearn [25], Hearn [34], and Francis [29,30,31].

Some authors have addressed specific quadratic problems. For
example, Bellman [3] discusses a solution technique for a quadratic
program with one quadratic constraint. The procedure is based on
simultaneous diagonalization of the two Hessians and a change of
variables. Van de Panne [69] presents a finite algorithm for a linear
objective function with linear constraints and one quadratic constraint.

Major developments in this area have been the 1967 generalized
geometric inequality dual of Peterson and Ecker [51,52,53], and the
conjugate function duality of Rockafellar [57]. The two developments
have one point in common: neither appears to relate to the duality of
Wolfe [71], which has been a powerful tool in mathematical programming.

A more recent work is that of Baron [2]. Baron discusses and
develops computational procedures for the class of quadratically
constrained quadratic programs using two forms of the Lagrangian function.
The duality theory employed was first shown by Falk [26] for strictly
convex programs. Baron uses the well-known convex programming algorithm
of Dantzig [20] for solution of these Lagrangian dual problems.

The  research  reported  herein  develops  a  new  dual  form  for  the  class 
of  convex  quadratically  constrained  quadratic  programs. 

In  Chapter  2  some  extensions  to  the  theory  of  the  Moore-Penrose 
[47,50]  generalized  inverse  for  sums  of  positive  semidefinite  matrices 
are  developed.   In  particular,  general  theorems  are  stated  which  are 
germane  to  the  developments  of  Chapter  3. 

In  Chapter  3  the  generalized  inverse  dual  and  its  properties  for 
convex  quadratically  constrained  quadratic  programs  are  developed.  The 
only  theoretical  basis  required  for  the  generalized  inverse  dual  is 
linear  algebra  and  Wolfe  [71]  duality.   It  is  also  shown  that  the 
corresponding  duals  of  Peterson  and  Ecker  and  Rockafellar  are  derivable 
from  the  generalized  inverse  dual  and  conversely. 

Chapter  4  is  a  discussion  of  a  feasible  direction  algorithm  for  the 
generalized  inverse  dual  and  the  conjugate  function  dual  of  Peterson  and 
Ecker  and  Rockafellar.   It  is  concluded  that  for  a  particular  subclass  of 
convex  quadratically  constrained  quadratic  programs  the  generalized  inverse 
dual  algorithm  should  prove  more  favorable  than  that  for  the  conjugate 
function  dual. 


Chapter 5 presents five applications where the generalized inverse
dual leads to more computationally tractable mathematical programming
problems.

Chapter  6  sets  out  specific  areas  of  future  research  based  on  the 
results  reported  herein. 


CHAPTER  2 

THE  MOORE-PENROSE  GENERALIZED  INVERSE 
OF  POSITIVE  SEMIDEFINITE  MATRICES 


2.1  Introduction 

The development of the duality theory for convex quadratically
constrained quadratic programs encompasses specializations of the
following extensions to the theory of the Moore-Penrose generalized
inverse. These extensions are developed and presented here because of
their importance and because they are not known to exist elsewhere
in the literature.

The  theory  of  this  chapter  is  restricted  to  sums  of  symmetric 
positive  semidefinite  matrices.  The  reason  for  this  restriction  will 
be  evident  in  Chapter  3. 

2.2  Extensions 

Notationally, let $Q_j$, $j = 1,2,\ldots,m$, be symmetric positive
semidefinite matrices of order $n \times n$ and rank $p_j$. Let

$$Q = \sum_{j=1}^{m} Q_j.$$

Then $Q$ is a symmetric positive semidefinite matrix.

Lemma 2.1. Let $Q^{-}$ be the generalized inverse of $Q$, then

$$Q^{-}Q = QQ^{-}.$$

Proof of Lemma 2.1. By the properties of the generalized inverse,
$Q^{-}Q$ is symmetric,

$$Q^{-}Q = (Q^{-}Q)^{t} = Q^{t}(Q^{-})^{t}.$$

From Corollary A.3.1, $Q = Q^{t}$ implies that $(Q^{-})^{t} = Q^{-}$, hence,

$$Q^{-}Q = QQ^{-}.$$

Q.E.D.

Theorem 2.1. Let $Q^{-}$ be the generalized inverse of $Q$, then

$$Q_j(I - Q^{-}Q) = 0,$$
$$Q_j(I - QQ^{-}) = 0,$$
$$(I - QQ^{-})Q_j = 0,$$
$$(I - Q^{-}Q)Q_j = 0.$$

Proof of Theorem 2.1. By definition

$$QQ^{-}Q = Q,$$
$$Q(I - Q^{-}Q) = 0,$$
$$\sum_{j=1}^{m} Q_j(I - Q^{-}Q) = 0. \tag{2.2.1}$$

From Theorem B.2, the trace of the left-hand side matrix of (2.2.1) is

$$\operatorname{tr}\Big[\sum_{j=1}^{m} Q_j(I - Q^{-}Q)\Big] = \sum_{j=1}^{m} \operatorname{tr}[Q_j(I - Q^{-}Q)] = 0.$$

Now $Q_j$ is symmetric positive semidefinite and by Theorem A.10,
$(I - Q^{-}Q)$ is positive semidefinite; hence, by Corollary B.4.1,

$$\operatorname{tr}[Q_j(I - Q^{-}Q)] \ge 0, \quad \text{for } j = 1,2,\ldots,m.$$

For a sum of nonnegative terms to add to zero they must all equal
zero. Therefore,

$$\operatorname{tr}[Q_j(I - Q^{-}Q)] = 0 \quad \text{and} \quad Q_j(I - Q^{-}Q) = 0$$

by Theorem B.4.

The other forms follow in the same manner using Lemma 2.1 and the
alternate form of (2.2.1),

$$\sum_{j=1}^{m} (I - QQ^{-})Q_j = 0.$$

Q.E.D.
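
The identities of Theorem 2.1 can be checked in exact arithmetic on a small case. The sketch below is an illustration only, not part of the original development: it borrows the two 4x4 matrices and the generalized inverse quoted in the numerical example of Section 3.3 (with both weights equal to one), first confirms the four Penrose conditions for that quoted inverse, and then tests that each $Q_j$ annihilates $I - Q^{-}Q$ from both sides. The helper names (`matmul`, `residual`, `satisfies_penrose`) are hypothetical.

```python
from fractions import Fraction as Fr

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def residual(G, Q):
    """Return I - G Q for square matrices G and Q."""
    GQ = matmul(G, Q)
    n = len(Q)
    return [[(1 if i == j else 0) - GQ[i][j] for j in range(n)] for i in range(n)]

def is_zero(A):
    return all(x == 0 for row in A for x in row)

def satisfies_penrose(Q, G):
    """Check the four Penrose conditions for G as generalized inverse of Q."""
    QG, GQ = matmul(Q, G), matmul(G, Q)
    return (matmul(QG, Q) == Q and matmul(GQ, G) == G
            and QG == transpose(QG) and GQ == transpose(GQ))

# Matrices assumed from the Section 3.3 example; Q = Q0 + Q1.
Q0 = [[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 2, 2], [1, 1, 2, 2]]
Q1 = [[0, 0, 0, 0], [0, 4, 0, 0], [0, 0, 5, 5], [0, 0, 5, 5]]
Q = [[a + b for a, b in zip(r0, r1)] for r0, r1 in zip(Q0, Q1)]

# Generalized inverse of Q as quoted in that example (entries over 24).
G = [[Fr(v, 24) for v in row]
     for row in ([34, -6, -2, -2], [-6, 6, 0, 0], [-2, 0, 1, 1], [-2, 0, 1, 1])]

R = residual(G, Q)  # I - Q^- Q
checks = [is_zero(matmul(Qj, R)) and is_zero(matmul(R, Qj)) for Qj in (Q0, Q1)]
```

Exact rationals are used so that "equals zero" is a genuine equality test rather than a floating-point tolerance.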


A notational device for weighted sums of matrices will be

$$Q = Q(y) = \sum_{j=1}^{m} y_j Q_j, \qquad y_j \in E^{1}.$$

If there is more than one set of weights, the distinguishing character
will be adjoined to $Q$, e.g., $y'$ and $Q'$. For emphasis the functional
notation $Q(y)$ will be retained.

Corollary 2.1.1. Let $Q = \sum_{j=1}^{m} y_j Q_j$, $y_j > 0$, $j = 1,2,\ldots,m$, then

$$Q_j(I - Q^{-}Q) = 0,$$
$$Q_j(I - QQ^{-}) = 0,$$
$$(I - QQ^{-})Q_j = 0,$$
$$(I - Q^{-}Q)Q_j = 0.$$

Corollary 2.1.2. Corollary 2.1.1 holds when $Q_j$ is replaced by
$Q_j^{-}$, the generalized inverse of $Q_j$.

Proof of Corollary 2.1.2. It will suffice to show that $(I - QQ^{-})Q_j^{-} = 0$
since the same approach is used in all cases.

$$(I - QQ^{-})Q_j^{-} = (I - QQ^{-})Q_j^{-}Q_jQ_j^{-};$$

since $Q_j$ is symmetric, by Lemma 2.1,

$$(I - QQ^{-})Q_j^{-} = (I - QQ^{-})Q_jQ_j^{-}Q_j^{-} = 0.$$

Q.E.D.

Theorem 2.2. Let $Q = Q(y) = \sum_{j=1}^{m} y_j Q_j$ and $Q' = Q(y') = \sum_{j=1}^{m} y'_j Q_j$,
where $y_j > 0$, $y'_j > 0$ for $j = 1,2,\ldots,m$, and $y' \ne y$. Then $Q^{-}Q = Q'^{-}Q'$.

Proof of Theorem 2.2. The proof will be to show that $(I - Q^{-}Q)$ and
$(I - Q'^{-}Q')$ are both generalized inverses of $(I - Q^{-}Q)$, which by the
uniqueness of the generalized inverse, Theorem A.2, implies

$$(I - Q^{-}Q) = (I - Q'^{-}Q').$$

Corollary A.8.2 and Theorems A.9 and A.10 show the generalized
inverse of $(I - Q^{-}Q)$ as being $(I - Q^{-}Q)$. Now, using Lemma 2.1 and Corollary
2.1.1, it is easily shown that $(I - Q'^{-}Q')$ satisfies the four properties
of the generalized inverse, e.g.,

$$\begin{aligned}
\text{(i)} \quad (I - Q^{-}Q)(I - Q'^{-}Q') &= (I - Q^{-}Q) - (I - Q^{-}Q)Q'^{-}Q' \\
&= (I - Q^{-}Q) - (I - Q^{-}Q)Q'Q'^{-} \\
&= (I - Q^{-}Q) - \sum_{j=1}^{m} y'_j (I - Q^{-}Q)Q_jQ'^{-} \\
&= (I - Q^{-}Q), \quad \text{hence symmetric,}
\end{aligned}$$

and therefore $(I - Q'^{-}Q')$ is also a generalized inverse of $(I - Q^{-}Q)$. Thus,

$$(I - Q^{-}Q) = (I - Q'^{-}Q')$$

and

$$Q^{-}Q = Q'^{-}Q'.$$

Q.E.D.

Corollary 2.2.1.

$$Q^{-}Q = Q'Q'^{-},$$
$$QQ^{-} = Q'^{-}Q',$$
$$QQ^{-} = Q'Q'^{-}.$$

Proof of Corollary 2.2.1. By Lemma 2.1,

$$Q^{-}Q = QQ^{-},$$
$$Q'^{-}Q' = Q'Q'^{-},$$

and the result is immediate.

Q.E.D.

Another  important  corollary  to  Theorem  2.2  deals  with  the  column 

space  and  null  space  of  a  matrix.  The  following  two  definitions  are 

stated  for  emphasis. 


Definition. Let $A$ be an $n \times m$ matrix; denote the $m$ columns of $A$ as
vectors in $E^{n}$, so that $A = [a_1, a_2, \ldots, a_m]$. The vector space spanned by
these $m$ column vectors of $A$ is defined as the column space of the matrix
$A$.

Definition. Let $A$ be an $n \times m$ matrix. The null space of the matrix
$A$ is defined to be the set of vectors $S$ where

$$S = \{\, y \mid Ay = 0;\ y \in E^{m} \,\}.$$

If the matrix $A$ is symmetric, the following lemma establishes the
equivalence between its null space and the orthogonal complement of the
column space.

Lemma  2.2.  Let  A  be  an  nxm  matrix.   The  null  space  of  A  and  the 

orthogonal  complement  of  the  column  space  of  A  are  the  same. 

A  proof  of  this  lemma  is  found  in  Graybill  [32]. 

Corollary 2.2.2. Given $Q = \sum_{j=1}^{m} y_j Q_j$ and $Q' = \sum_{j=1}^{m} y'_j Q_j$; $y_j, y'_j > 0$
for $j = 1,2,\ldots,m$. Then the column vectors of $Q$ span the same space as
do those of $Q'$; i.e., $Q$ and $Q'$ have the same column space.

Proof of Corollary 2.2.2. By Lemma 2.2 and Corollary A.8.1 the
null space of $Q$ is spanned by $(I - Q^{-}Q)$ and that of $Q'$ by $(I - Q'^{-}Q')$. Let
$b$ be in the column space of $Q$. Then,

$$(I - Q^{-}Q)b = 0,$$
$$Q^{-}Qb = b,$$

but by Theorem 2.2

$$Q^{-}Q = Q'^{-}Q'$$

and

$$(I - Q'^{-}Q')b = 0.$$

Hence, $b$ is in the column space of $Q'$. Likewise, a vector $b'$ in the
column space of $Q'$ is in the column space of $Q$. Therefore, $Q$ and $Q'$
have the same column space.

Q.E.D.

Corollary 2.2.3. Given $Q = \sum_{j=1}^{m} y_j Q_j$ and for some index set
$I = \{i_1, i_2, \ldots, i_{m'}\}$, $m' \le m$, such that $I \subseteq \{1,2,\ldots,m\}$, let
$Q' = \sum_{j \in I} y'_j Q_j$, $y'_j > 0$ and $y'_j \ne y_j$; then the column space of $Q'$
is contained in or equal to the column space of $Q$.

Proof of Corollary 2.2.3. Take $b'$ in the column space of $Q'$; then

$$Q'Q'^{-}b' = b'.$$

Premultiplying by $QQ^{-}$ and applying Corollary 2.1.1 it is seen that

$$QQ^{-}Q'Q'^{-}b' = Q'Q'^{-}b' = QQ^{-}b' = b'.$$

Hence, the column space of $Q'$ is contained in the column space of $Q$.
Now take $b$ in the column space of $Q$,

$$QQ^{-}b = b.$$

Premultiply by $(I - Q'Q'^{-})$; i.e., if $b$ is in the column space of $Q'$ it will
have zero component in the null space of $Q'$.

$$(I - Q'Q'^{-})QQ^{-}b = (I - Q'Q'^{-})b,$$
$$\sum_{j \notin I} y_j (I - Q'Q'^{-})Q_jQ^{-}b = (I - Q'Q'^{-})b,$$

by Corollary 2.1.1, and clearly only if $Q_j$, for $j \notin I$, is in the column
space of $Q'$ will $b$ have zero component in the null space of $Q'$. Hence,
in general, the column space of $Q$ will not be included in the
column space of $Q'$.

Q.E.D.

An  alternative  and  useful  way  of  describing  the  matrix  Q,  through 

decomposition,  is  based  on  the  following  lemma. 
Lemma 2.3. An $n \times n$ symmetric positive semidefinite matrix $Q$ can be
factored as $Q = B^{t}B$, where $B$ is an $n \times n$ upper triangular matrix. For $Q$
of rank $p$, $B$ is of the form

$$B = [\hat{B}^{t}, 0]^{t},$$

$\hat{B}$ being of order $p \times n$.

Proof of Lemma 2.3. It is well known [7] that a symmetric positive
semidefinite matrix $Q$ of rank $p$ can be factored into $\hat{B}^{t}\hat{B}$ where $\hat{B}$ is an
upper triangular matrix of order $p \times n$. Form the $n \times n$ matrix $B = [\hat{B}^{t}, 0]^{t}$, $0$
being a zero matrix of order $n \times (n-p)$; then

$$B^{t}B = [\hat{B}^{t}, 0] \begin{bmatrix} \hat{B} \\ 0 \end{bmatrix} = \hat{B}^{t}\hat{B} + 0 = Q,$$

conformability requirements being satisfied.

Q.E.D.
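
As a small worked instance of Lemma 2.3 (an illustration only, not taken from the source), consider the rank-two matrix $Q_0$ that appears in the numerical example of Section 3.3. Writing $\hat{B}$ for the $p \times n$ factor (the accent is a notational assumption made here), one admissible choice with $p = 2$, $n = 4$ is shown below, and padding with zero rows gives the $4 \times 4$ upper triangular $B$.

```latex
\[
\hat{B} = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 \end{bmatrix},
\qquad
\hat{B}^{t}\hat{B}
= \begin{bmatrix} 1 & 0 \\ 1 & 0 \\ 1 & 1 \\ 1 & 1 \end{bmatrix}
  \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 \end{bmatrix}
= \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 2 & 2 \\ 1 & 1 & 2 & 2 \end{bmatrix}
= Q_0,
\]
\[
B = \begin{bmatrix} \hat{B} \\ 0 \end{bmatrix}
  = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix},
\qquad B^{t}B = \hat{B}^{t}\hat{B} = Q_0 .
\]
```

The factor is not unique; any $\hat{B}$ with the same Gram matrix would serve equally well.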

The matrix $Q$, which is a weighted sum of symmetric positive semidefinite
matrices $Q_j$, $j = 1,2,\ldots,m$, can now be expressed as

$$Q = \sum_{j=1}^{m} y_j B_j^{t} B_j,$$

which can be put into a matrix product form,

$$Q = \bar{B}^{t} Y^{2} \bar{B},$$

where

$$\bar{B}^{t} = [B_1^{t}, B_2^{t}, \ldots, B_m^{t}]$$

and

$$Y^{2} = \begin{bmatrix} y_1 & 0 & \cdots & 0 \\ 0 & y_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & y_m \end{bmatrix} \otimes I,$$

$I$ the $n \times n$ identity matrix, so that $Y^{2}$ is block diagonal with
blocks $y_j I$. $Y^{2}$ is defined by a Kronecker product; see Appendix A for definition.

A further simplification of this notational form can be realized
by noting that $Y$ is defined as having diagonal elements $\sqrt{y_j}$; then

$$Q = B^{t}B$$

with

$$B = B(y) = Y\bar{B}.$$
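
Theorem 2.3.

$$QQ^{-} = B^{t}B^{t-} = B^{-}B.$$

Proof of Theorem 2.3.

$$QQ^{-} = B^{t}B(B^{t}B)^{-}.$$

By Corollary A.3.2, $(B^{t}B)^{-} = B^{-}B^{t-}$; then

$$\begin{aligned}
QQ^{-} &= B^{t}BB^{-}B^{t-} \\
&= B^{t}B^{t-}B^{t}B^{t-} \\
&= B^{t}B^{t-} = B^{-}B.
\end{aligned}$$

Q.E.D.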


Corollary 2.3.1.

$$B^{t}B^{t-}Q_j = B^{-}BQ_j = Q_j,$$
$$Q_jB^{t}B^{t-} = Q_jB^{-}B = Q_j.$$

Proof of Corollary 2.3.1. By Corollary 2.1.1,

$$QQ^{-}Q_j = Q_j,$$

and the result is immediate.

Q.E.D.

2.3  Differential  Forms 

In order to express the differential of the generalized inverse of

$$Q = \sum_{j=1}^{m} y_j Q_j, \qquad y_j > 0 \ \text{for } j = 1,2,\ldots,m,$$

it will be useful to state the following lemma.
The lemma, while well known, is stated and proved to emphasize the
correspondence between the partial derivative forms for the nonsingular
inverse and the generalized inverse. It is assumed that $Q$ is a function
of $y$ with the $Q_j$ fixed for $j = 1,2,\ldots,m$.


Lemma 2.4. Given $Q = \sum_{j=1}^{m} y_j Q_j$, $y_j > 0$ for $j = 1,2,\ldots,m$, and $Q$ is
nonsingular, then

$$\frac{\partial Q^{-1}}{\partial y_k} = -Q^{-1} Q_k Q^{-1}.$$


Proof  of  Lemma  2.4. 

Q.E.D. 
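
For reference, the usual textbook argument for the derivative of a nonsingular inverse is sketched below. It is offered as a standard derivation under the assumption that $Q(y) = \sum_j y_j Q_j$ is nonsingular in a neighborhood of $y$; it is not claimed to reproduce the wording of the proof of Lemma 2.4.

```latex
\begin{align*}
Q\,Q^{-1} &= I, \\
\frac{\partial Q}{\partial y_k}\,Q^{-1} + Q\,\frac{\partial Q^{-1}}{\partial y_k} &= 0
  && \text{(product rule, } I \text{ is constant)}, \\
\frac{\partial Q}{\partial y_k} &= Q_k
  && \text{(} Q \text{ is linear in } y \text{ with the } Q_j \text{ fixed)}, \\
\frac{\partial Q^{-1}}{\partial y_k} &= -\,Q^{-1} Q_k\, Q^{-1}.
\end{align*}
```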

Using the alternate definition of the generalized inverse, the
partial derivative of $Q^{-}$ can be given for $Q$ singular. It is seen that,
while the proof is more involved, a form similar to that for the
nonsingular $Q$ is derived and, as expected, the generalized form is
equivalent to that of Lemma 2.4 for $Q$ nonsingular.

Theorem 2.4. Given $Q = \sum_{j=1}^{m} y_j Q_j$, $y_j > 0$ for $j = 1,2,\ldots,m$, then

$$\frac{\partial Q^{-}}{\partial y_k} = -Q^{-} Q_k Q^{-}.$$

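Proof of Theorem 2.4. The alternate definition of the generalized
inverse of $Q$ is

$$Q^{-} = \lim_{\delta \to 0} (QQ + \delta^{2}I)^{-1}Q$$

where $(QQ + \delta^{2}I)$ is a nonsingular symmetric matrix for all $\delta > 0$. The
partial derivative of $Q^{-}$ with respect to $y_k$ is then

$$\frac{\partial Q^{-}}{\partial y_k} = \lim_{\delta \to 0} \left[ \frac{\partial}{\partial y_k}(QQ + \delta^{2}I)^{-1} Q + (QQ + \delta^{2}I)^{-1}Q_k \right]. \tag{2.3.1}$$

By Lemma 2.4,

$$\frac{\partial}{\partial y_k}(QQ + \delta^{2}I)^{-1} = -(QQ + \delta^{2}I)^{-1}(Q_kQ + QQ_k)(QQ + \delta^{2}I)^{-1},$$

which when substituted into (2.3.1) results in

$$\begin{aligned}
\frac{\partial Q^{-}}{\partial y_k} ={}& \lim_{\delta \to 0} (QQ + \delta^{2}I)^{-1}Q_k \\
& - \lim_{\delta \to 0} (QQ + \delta^{2}I)^{-1}Q_kQ(QQ + \delta^{2}I)^{-1}Q \\
& - \lim_{\delta \to 0} (QQ + \delta^{2}I)^{-1}QQ_k(QQ + \delta^{2}I)^{-1}Q.
\end{aligned}$$

Noting that

$$\lim_{\delta \to 0} (QQ + \delta^{2}I)^{-1}Q = Q^{-},$$

then

$$\lim_{\delta \to 0} Q(QQ + \delta^{2}I)^{-1}Q = QQ^{-} = Q^{-}Q.$$

Recalling Corollary 2.1.1, it is seen that

$$\lim_{\delta \to 0} (QQ + \delta^{2}I)^{-1}Q_kQ(QQ + \delta^{2}I)^{-1}Q = \lim_{\delta \to 0} (QQ + \delta^{2}I)^{-1}Q_k.$$

Therefore,

$$\frac{\partial Q^{-}}{\partial y_k} = -\lim_{\delta \to 0} (QQ + \delta^{2}I)^{-1}QQ_k(QQ + \delta^{2}I)^{-1}Q,$$

which, by the alternate definition of the generalized inverse, is
equivalent to

$$\frac{\partial Q^{-}}{\partial y_k} = -Q^{-}Q_kQ^{-}.$$

Q.E.D.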
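
A singular diagonal case makes the formula $\partial Q^{-}/\partial y_k = -Q^{-}Q_kQ^{-}$ easy to verify by hand. The example below is a hypothetical illustration chosen here, with $m = 2$, $n = 2$ and both $Q_j$ of rank one, so that $Q(y)$ is singular for every $y > 0$.

```latex
\[
Q_1 = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \quad
Q_2 = \begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix}, \quad
Q(y) = \begin{bmatrix} y_1 + 2y_2 & 0 \\ 0 & 0 \end{bmatrix}, \quad
Q^{-}(y) = \begin{bmatrix} \dfrac{1}{y_1 + 2y_2} & 0 \\ 0 & 0 \end{bmatrix},
\]
\[
\frac{\partial Q^{-}}{\partial y_2}
= \begin{bmatrix} -\dfrac{2}{(y_1 + 2y_2)^{2}} & 0 \\ 0 & 0 \end{bmatrix}
= -\,Q^{-}(y)\, Q_2\, Q^{-}(y).
\]
```

The requirement $y_j > 0$ matters: it keeps the rank of $Q(y)$ constant, which is what allows the generalized inverse to be differentiated in this way.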


CHAPTER  3 


GENERALIZED  INVERSE  DUAL  FOR  CONVEX  QUADRATICALLY 
CONSTRAINED  QUADRATIC  PROGRAMS 


3.1  Introduction 

The material of this chapter is an extension of duality theory
developed from the classical Lagrangian analysis. In particular, the
theory is developed from the duality of Wolfe [71].

The philosophy of the analysis parallels that of Falk [26]. Falk's
results are for more general problems, but have limited usefulness in
algorithmic application. His results for strictly convex problems are
extended to convex quadratically constrained quadratic programs.

Other duals for quadratically constrained quadratic programs are
those of Peterson and Ecker [51,52,53] and Rockafellar [57]. The
Peterson-Ecker dual is established by use of a generalized geometric
inequality and the Rockafellar dual by conjugate function analysis.

The work of Peterson and Ecker and Rockafellar motivated a number
of results in this chapter. It is shown that the smaller generalized
inverse dual has the same desirable features as the conjugate dual and
that the duals are derivable from each other. All proofs are new and
require only linear algebra. Furthermore, the generalized inverse dual
reveals a useful characterization of the constraint set not available
with the conjugate dual form.

3.2 Duality

The  following  development  establishes  the  generalized  inverse  dual 
and  its  properties  for  the  class  of  convex  quadratically  constrained 
quadratic  programs. 

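The primal problem is stated in the general form,

$$\{P.1\} \quad \begin{aligned}
\underset{x}{\text{minimize}} \quad & \Phi_0(x) = \tfrac{1}{2} x^{t} Q_0 x + h_0^{t} x + c_0 \\
\text{subject to:} \quad & \Phi_j(x) = \tfrac{1}{2} x^{t} Q_j x + h_j^{t} x + c_j \le 0 \quad \text{for } j = 1,2,\ldots,m
\end{aligned}$$

where $Q_j$ is an $n \times n$ real, symmetric, positive semidefinite matrix,
$h_j \in E^{n}$, $c_j \in E^{1}$ for $j = 0,1,2,\ldots,m$.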

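Theorem 3.1 is the major result of this chapter and is based on the
theory of the Moore-Penrose generalized inverse of a matrix.

Theorem 3.1. The Wolfe dual of {P.1} is equivalent to

$$\{D.1\} \quad \begin{aligned}
\text{maximize} \quad & \psi(y) = -\tfrac{1}{2} y^{t} H^{t} Q^{-} H y + c^{t} y \\
\text{subject to:} \quad & y = (y_0, y_1, y_2, \ldots, y_m)^{t} \ge 0 \\
& y_0 = 1 \\
& QQ^{-}Hy = Hy
\end{aligned}$$

where

$$H = (h_0, h_1, h_2, \ldots, h_m),$$
$$Q = Q(y) = \sum_{j=0}^{m} y_j Q_j,$$
$$c = (c_0, c_1, c_2, \ldots, c_m)^{t},$$

and $Q^{-}$ is the generalized inverse of $Q$.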

Proof of Theorem 3.1. The Wolfe dual [71] of {P.1} is

$$\{W.1\} \quad \begin{aligned}
\text{maximize} \quad & \psi_w(x,y) = \tfrac{1}{2} x^{t}Qx + y^{t}H^{t}x + c^{t}y \qquad (3.2.1) \\
\text{subject to:} \quad & y = (y_0, y_1, \ldots, y_m)^{t} \ge 0 \\
& Qx + Hy = 0 \qquad (3.2.2) \\
& y_0 = 1
\end{aligned}$$

$H$, $Q$, and $c$ defined as above.

By Theorem A.7, (3.2.2) can be expressed as

$$x = -Q^{-}Hy - (I - Q^{-}Q)g \tag{3.2.3}$$

for $g \in E^{n}$. Equation (3.2.2) implies that for a given nonnegative $y$ to
be feasible $Hy$ must lie in the column space of $Q$. Hence,

$$QQ^{-}Hy = Hy$$

as stated in Theorem A.6. Substitution for $x$, by (3.2.3) in (3.2.1),
results in

$$\psi(y) = -\tfrac{1}{2} y^{t}H^{t}Q^{-}(y)Hy + c^{t}y.$$

Thus, {W.1} and {D.1} are equivalent.

Q.E.D.
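
The substitution step of the proof can be written out term by term. The following is a sketch of the algebra, assuming only $QQ^{-}Q = Q$, the symmetry of $Q$ and of $Q^{-}Q = QQ^{-}$ (Lemma 2.1), and the feasibility condition $QQ^{-}Hy = Hy$. The Wolfe objective (3.2.1) is written $\psi_w(x,y)$ here, and the intermediate grouping is one of several possible rather than a transcription of the source.

```latex
% Start from x = -Q^{-}Hy - (I - Q^{-}Q)g, equation (3.2.3).
\begin{align*}
Qx &= -QQ^{-}Hy - Q(I - Q^{-}Q)g = -Hy, \\
x^{t}Qx &= x^{t}(Qx) = -\,y^{t}H^{t}x, \\
y^{t}H^{t}x &= -\,y^{t}H^{t}Q^{-}Hy - \big[(I - QQ^{-})Hy\big]^{t} g
             = -\,y^{t}H^{t}Q^{-}Hy, \\
\psi_w(x,y) &= \tfrac{1}{2}x^{t}Qx + y^{t}H^{t}x + c^{t}y
             = \tfrac{1}{2}\,y^{t}H^{t}x + c^{t}y
             = -\tfrac{1}{2}\,y^{t}H^{t}Q^{-}Hy + c^{t}y .
\end{align*}
```

The arbitrary vector $g$ drops out of every term, which is why the dual can be stated in the variables $y$ alone.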

The proof that $\psi(y)$ is concave on the feasible set will be deferred
until Section 3.4, since the proof is more direct when
based on differentiability of $\psi(y)$.

As a result of the equivalence of {D.1} to {W.1} the major duality
results of Wolfe can be applied to {D.1}. The weak duality theorem
states that $\psi(y) \le \Phi_0(x)$ for $y$ feasible to {D.1} and $x$ feasible to {P.1},
and by Wolfe's duality theorem $\psi(y^{\circ}) = \Phi_0(x^{\circ})$ for $y^{\circ}$ and $x^{\circ}$ optimal.
Also, the unbounded dual theorem and the no primal minimum theorem apply
to {P.1} and {D.1}.

Corollary 3.1.1. If each of the $Q_j$ of {P.1} is a diagonal matrix
then the dual {D.1} is a fractional programming problem with linear
constraints.

Proof of Corollary 3.1.1. Clearly $Q$ is a diagonal matrix of rank $r$;
then by a suitable transformation

$$Q = (q_{ij}) = (q_{ij}(y))$$

where

$$q_{ii} = \text{polynomial of degree one in } y, \quad i = 1,2,\ldots,r,$$
$$q_{ij} = 0 \quad \text{for } i \ne j,$$
$$q_{ii} = 0 \quad \text{for } i = r+1, r+2, \ldots, n.$$

Furthermore, reflecting the above transformation, let

$$Hy = (\xi_1, \xi_2, \ldots, \xi_n)^{t}$$

where $\xi_i$, $i = 1,2,\ldots,n$, is a polynomial of degree one in $y$.

Using Theorem A.10 and simplifying, {D.1} becomes

$$\{D.2\} \quad \begin{aligned}
\text{maximize} \quad & \psi(y) = -\frac{1}{2} \sum_{i=1}^{r} \frac{\xi_i^{2}}{q_{ii}} + \sum_{j=1}^{m} c_j y_j + c_0 \\
\text{subject to:} \quad & y_j \ge 0, \quad j = 1,2,\ldots,m \\
& \xi_{r+k} = 0, \quad k = 1,2,\ldots,n-r.
\end{aligned}$$

Q.E.D.

A subclass of {P.1} for which the only dual constraint is
nonnegativity is characterized by $\Phi_0(x)$ being strictly convex. This
corresponds, by a well-known theorem, to $Q_0$ being positive definite.

Corollary 3.1.2. Given {P.1} has a strictly convex objective
function then {D.1} reduces to

$$\{D.3\} \quad \begin{aligned}
\text{maximize} \quad & \psi(y) = -\tfrac{1}{2} y^{t}H^{t}Q^{-1}Hy + c^{t}y \\
\text{subject to:} \quad & y = (y_0, y_1, y_2, \ldots, y_m)^{t} \ge 0 \\
& y_0 = 1.
\end{aligned}$$

Proof of Corollary 3.1.2. $Q_0$ is positive definite and therefore
nonsingular. Hence, $Q$ is positive definite and nonsingular. The
generalized inverse of $Q$ is then $Q^{-1}$.

Q.E.D.


Note that if $y_j > 0$ for some $Q_j$ positive definite, the same result
holds. This corresponds to a strictly convex $\Phi_j$ being an active primal
constraint so that, via complementary slackness, $y_j > 0$.

Corollary 3.1.3. {D.3} is a fractional programming problem.

Proof of Corollary 3.1.3. Each element of $Q^{-1}$ is a polynomial in
the $y_j$ divided by a polynomial in the $y_j$, that is, let

$$Q^{-1} = (\bar{q}_{ij}) = (\bar{q}_{ij}(y))$$

where $\bar{q}_{ij}$ is determined by the cofactor of the $ji$th element of $Q$ and is
a polynomial in the $y_j$ of degree $n-1$ divided by the determinant of $Q$,
which is a polynomial in the $y_j$ of degree $n$.

The vector $Hy$ has elements which are polynomials in the $y_j$ of degree
1. Therefore, $y^{t}H^{t}Q^{-1}Hy$ will be a polynomial of degree $n+1$ divided by a
polynomial of degree $n$.

Q.E.D.

{D.1} is a function of dual variables only, the number of dual
variables being equal to the number of primal constraints. In this
respect, it is a theoretical improvement over previously developed duals
for the class {P.1}.

To expand on {D.1} and investigate its potential as a computational
tool, the characterization of the constraint set must be simplified.
The constraint set will be shown to be convex, but in general it is
neither open nor closed. The approach is then to define its closure and
relative interior.

It will be seen that there is a unique characterization of the
closure and relative interior of the constraint set determined by a
specially structured feasible dual vector. In fact, by just knowing the
form of this vector the dual problem is significantly simplified in that
the constraints reduce to linear equality and inequality forms.

The importance of the closure characterization will be demonstrated
by showing that the dual problem solved on the closure of the constraint
set results in the same solution as if solved over the original constraints.
3.3  Feasible  Set  F 

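The set of feasible points for {D.1} can be expressed by

$$F = \{\, y \in E_{+}^{m+1} \mid y_0 = 1,\ Q(y)Q^{-}(y)Hy = Hy \,\}$$

where $y = (y_0, y_1, \ldots, y_m)^{t}$ and $E_{+}^{m+1}$ is the nonnegative orthant of $E^{m+1}$.

Theorem 3.2. The feasible set $F$ is convex.

Proof of Theorem 3.2. Let $y^{1}, y^{2} \in F$ and $y^{3} = \lambda y^{1} + (1-\lambda)y^{2}$,
$0 < \lambda < 1$. By $y^{1}, y^{2}$ feasible,

$$Q(y^{1})Q^{-}(y^{1})Hy^{1} = Hy^{1},$$
$$Q(y^{2})Q^{-}(y^{2})Hy^{2} = Hy^{2},$$

and $\lambda y^{1} + (1-\lambda)y^{2} = y^{3} \ge 0$ with $y^{3}_0 = 1$. Furthermore, by Corollary 2.1.1,
noting that if $y^{1}_j > 0$ then $y^{3}_j > 0$ and if $y^{2}_j > 0$ then $y^{3}_j > 0$,

$$[I - Q^{-}(y^{3})Q(y^{3})]Q(y^{1}) = 0,$$
$$[I - Q^{-}(y^{3})Q(y^{3})]Q(y^{2}) = 0.$$

Hence,

$$[I - Q^{-}(y^{3})Q(y^{3})]\{\lambda Q(y^{1})Q^{-}(y^{1})Hy^{1} + (1-\lambda)Q(y^{2})Q^{-}(y^{2})Hy^{2}\} = 0$$

and

$$[I - Q^{-}(y^{3})Q(y^{3})]Hy^{3} = 0.$$

Thus, $y^{3} = \lambda y^{1} + (1-\lambda)y^{2} \in F$.

Q.E.D.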

A new notion will now be introduced which leads to a linear
characterization of the closure of $F$, designated by $F^{*}$. The notion is
that of an index maximal dual feasible vector. The existence of such a
vector is assured for all dual feasible problems. The disadvantage is
that it is not easily computed.

To  introduce  the  concept  of  an  index  maximal  dual  feasible  vector 
the  index  set  of  feasible  dual  vectors  is  defined. 

Definition. The index set of feasible dual vectors is

$$N(y) = \{\, j \mid y_j > 0 \,\}.$$

Note that for every $y \in F$, $0 \in N(y)$.

The size of $N(y)$ is defined as the number of elements it contains.

Definition. The size of the index set of feasible dual vectors
is

$$S(y) = r$$

where $r$ is the number of dual variables, including $y_0$, that are positive,
e.g., given

$$N(y) = \{0, j_1, j_2, \ldots, j_{r-1}\}$$

then

$$S(y) = r.$$

Definition. If $y^{*} \in F$ and $S(y^{*}) \ge S(y)$ for all $y \in F$ then $y^{*}$ is
an index maximal dual feasible vector.

It  is  clear  there  will  exist  such  a  vector  for  every  dual  feasible 
problem. 

In  referring  to  such  vectors,  the  shortened  term  "index  maximal" 
will  often  be  used,  but  where  emphasis  is  required  the  full  term  will  be 
employed. 

Theorem 3.3. Given $y \in F$ then

$$(I - Q^{-}(y)Q(y))Q_j = Q_j(I - Q^{-}(y)Q(y)) = 0$$

for $j \in N(y)$.

Proof of Theorem 3.3. The proof is immediate from Corollary 2.1.1
on noting that $y_j > 0$ for $j \in N(y)$.

Q.E.D.

Corollary 3.3.1. Let $y, y' \in F$ such that $N(y) = N(y')$. Then $Q(y)$
and $Q(y')$ span the same column space and $Q(y)Q^{-}(y) = Q(y')Q^{-}(y')$.

Proof of Corollary 3.3.1. The proof follows from Corollary 2.2.2,
by again noting that $y_j > 0$ for $j \in N(y)$, and Theorem 2.2.

Q.E.D.

Theorem 3.4. Given $y^{*} \in F$ is an index maximal dual feasible vector
and $y \in F$, then

$$N(y) \subseteq N(y^{*}).$$

Proof of Theorem 3.4. If $F$ consists of a single point the theorem
is trivial; therefore, assume that $F$ contains more than one point.

The proof will be by contradiction. Assume that $N(y) \not\subseteq N(y^{*})$. $y^{*}$
index maximal implies that $N(y^{*}) \not\subseteq N(y)$.

Since $F$ is convex there exists $y^{2} \in F$ such that
$y^{2} = \lambda y^{*} + (1-\lambda)y$ for some $0 < \lambda < 1$. Hence,

$$N(y^{2}) = N(y^{*}) \cup N(y) = \{\, j \mid y^{*}_j > 0 \text{ or } y_j > 0 \,\}.$$

This leads to a contradiction on $y^{*}$ being index maximal, i.e.,
$N(y^{*}) \subset N(y^{2})$ implying that $S(y^{2}) > S(y^{*})$.

Q.E.D.
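
The combinatorial step in the proof of Theorem 3.4 is that a strict convex combination of two nonnegative vectors is positive exactly where either vector is. A minimal sketch of that step follows; the function names are hypothetical and the two test vectors are simply nonnegative vectors with first component one, so the code illustrates the index-set bookkeeping only and does not test dual feasibility.

```python
from fractions import Fraction as Fr

def index_set(y):
    """N(y): the indices j with y_j > 0."""
    return {j for j, yj in enumerate(y) if yj > 0}

def size(y):
    """S(y): the number of positive components of y."""
    return len(index_set(y))

def convex_combination(lam, y_a, y_b):
    """Return lam * y_a + (1 - lam) * y_b, componentwise."""
    return [lam * a + (1 - lam) * b for a, b in zip(y_a, y_b)]

# Two nonnegative vectors whose supports differ (illustrative values).
y_a = [Fr(1), Fr(0), Fr(1)]
y_b = [Fr(1), Fr(2), Fr(0)]

# For 0 < lam < 1 the support of the combination is the union of supports.
y_mid = convex_combination(Fr(1, 2), y_a, y_b)
union = index_set(y_a) | index_set(y_b)
```

Because neither support contains the other here, the combination has strictly more positive components than either vector, which is the contradiction used in the proof when one of them is assumed index maximal.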

Theorem 3.5. Let

$$F^{*} = \{\, y \in E_{+}^{m+1} \mid y_0 = 1,\ y^{*} \text{ index maximal},\ [I - Q^{-}(y^{*})Q(y^{*})]Hy = 0 \,\},$$

then $F \subseteq F^{*}$.

Proof of Theorem 3.5. Let $y^{*}$ be index maximal and $y \in F$; then by
Theorem 3.4, $N(y) \subseteq N(y^{*})$. $y \in F$ implies $Q(y)Q^{-}(y)Hy = Hy$; multiplying
by $[I - Q^{-}(y^{*})Q(y^{*})]$,

$$[I - Q^{-}(y^{*})Q(y^{*})]Q(y)Q^{-}(y)Hy = [I - Q^{-}(y^{*})Q(y^{*})]Hy.$$

By Theorem 3.3

$$[I - Q^{-}(y^{*})Q(y^{*})]Q(y) = 0,$$

therefore,

$$[I - Q^{-}(y^{*})Q(y^{*})]Hy = 0$$

for all $y \in F$ and $F \subseteq F^{*}$.

Q.E.D.

$F^{*}$ is closed and convex and if $\Phi_j(x)$ is linear for $j = 1,2,\ldots,m$, $F^{*} = F$.


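A significant consequence of this theorem is that while the concept
of an index maximal dual feasible vector implies dual feasibility it is
only necessary to know $N(y^{*})$. That is, by Corollary 3.3.1 any vector $y$
such that $N(y) = N(y^{*})$ will suffice for defining the matrix $(I - Q^{-}(y^{*})Q(y^{*}))$,
the most logical being that vector which has $y_j = 1$ for $j \in N(y^{*})$ and
$y_j = 0$ for $j \notin N(y^{*})$. Therefore, the set $N(y^{*})$ and not the particular
$y^{*}$ defines $F^{*}$. For an example of this idea let

$$Q_0 = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 2 & 2 \\ 1 & 1 & 2 & 2 \end{bmatrix}, \qquad
Q_1 = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 4 & 0 & 0 \\ 0 & 0 & 5 & 5 \\ 0 & 0 & 5 & 5 \end{bmatrix}$$

and $y^{*} = (1,3)^{t}$, resulting in

$$Q(y^{*}) = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 13 & 1 & 1 \\ 1 & 1 & 17 & 17 \\ 1 & 1 & 17 & 17 \end{bmatrix}
\quad \text{and} \quad
Q^{-}(y^{*}) = \frac{1}{192} \begin{bmatrix} 220 & -16 & -6 & -6 \\ -16 & 16 & 0 & 0 \\ -6 & 0 & 3 & 3 \\ -6 & 0 & 3 & 3 \end{bmatrix}.$$

Finally,

$$Q(y^{*})Q^{-}(y^{*}) = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1/2 & 1/2 \\ 0 & 0 & 1/2 & 1/2 \end{bmatrix}.$$

Now let $y = (1,1)^{t}$; then

$$Q(y) = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 5 & 1 & 1 \\ 1 & 1 & 7 & 7 \\ 1 & 1 & 7 & 7 \end{bmatrix}, \qquad
Q^{-}(y) = \frac{1}{24} \begin{bmatrix} 34 & -6 & -2 & -2 \\ -6 & 6 & 0 & 0 \\ -2 & 0 & 1 & 1 \\ -2 & 0 & 1 & 1 \end{bmatrix}$$

and

$$Q(y)Q^{-}(y) = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1/2 & 1/2 \\ 0 & 0 & 1/2 & 1/2 \end{bmatrix}.$$

Thus it is seen that $Q(y^{*})Q^{-}(y^{*}) = Q(y)Q^{-}(y)$ for $N(y^{*}) = N(y)$.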
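
The arithmetic of this example can be reproduced exactly with rational numbers. The sketch below is only a verification aid and assumes the matrices as read from the example (the weights $(1,3)$ and $(1,1)$ and the two quoted generalized inverses with denominators 192 and 24); the helper names `matmul`, `weighted_sum` and `scaled` are hypothetical.

```python
from fractions import Fraction as Fr

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def weighted_sum(y, mats):
    """Q(y) = sum_j y_j Q_j for square matrices of equal order."""
    n = len(mats[0])
    return [[sum(yj * M[i][k] for yj, M in zip(y, mats)) for k in range(n)]
            for i in range(n)]

def scaled(denominator, rows):
    """Matrix with integer numerators over a common denominator."""
    return [[Fr(v, denominator) for v in row] for row in rows]

Q0 = [[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 2, 2], [1, 1, 2, 2]]
Q1 = [[0, 0, 0, 0], [0, 4, 0, 0], [0, 0, 5, 5], [0, 0, 5, 5]]

# Generalized inverses as quoted in the example.
G_star = scaled(192, [[220, -16, -6, -6], [-16, 16, 0, 0], [-6, 0, 3, 3], [-6, 0, 3, 3]])
G_one = scaled(24, [[34, -6, -2, -2], [-6, 6, 0, 0], [-2, 0, 1, 1], [-2, 0, 1, 1]])

P_star = matmul(weighted_sum([1, 3], [Q0, Q1]), G_star)  # Q(y*) Q^-(y*)
P_one = matmul(weighted_sum([1, 1], [Q0, Q1]), G_one)    # Q(y) Q^-(y)

half = Fr(1, 2)
expected = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, half, half], [0, 0, half, half]]
```

Both products coincide with the orthogonal projector onto the common column space, as Theorem 2.2 and Corollary 3.3.1 predict for weight vectors with the same index set.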

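An example of $F \subset F^{*}$ is the following. Take the primal problem

$$\{E.1\} \quad \begin{aligned}
\text{minimum} \quad & \Phi_0(x) = \tfrac{1}{2} x^{t} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} x + \begin{bmatrix} 1 \\ -1 \\ -2 \end{bmatrix}^{t} x \\
\text{subject to:} \quad & \Phi_1(x) = \tfrac{1}{2} x^{t} \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix} x + \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}^{t} x - 5 \le 0 \\
& \Phi_2(x) = \tfrac{1}{2} x^{t} \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} x + \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}^{t} x - 4 \le 0.
\end{aligned}$$

The primal solution is $x^{\circ} = (-1,2,2)^{t}$ and $\Phi_0(x^{\circ}) = -6.5$. For the dual
problem

$$Q = \begin{bmatrix} 1 & 0 & 0 \\ 0 & y_1 & 0 \\ 0 & 0 & y_2 \end{bmatrix}, \qquad
Hy = \begin{bmatrix} 1 \\ -1 + y_2 \\ -2 + y_1 \end{bmatrix}$$

and $Q$ is seen to be nonsingular for $y = (1,1,1)^{t}$, an index maximal
dual feasible vector. Hence,

$$[I - Q^{-}(y)Q(y)] = 0$$

and

$$F^{*} = \{\, y_0 = 1,\ y_1 \ge 0,\ y_2 \ge 0 \,\}.$$

$F$ is seen to differ from $F^{*}$ by noting that no $y = (1,0,y_2)^{t}$, $y_2 \ne 1$,
or $y = (1,y_1,0)^{t}$, $y_1 \ne 2$, is in $F$. See Figure 3-2(c).

$$F = \{\, y_0 = 1,\ y_1 > 0,\ y_2 > 0 \,\} \cup \{(1,0,1)^{t}, (1,2,0)^{t}\}$$

and $F \subset F^{*}$, $F \ne F^{*}$.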
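
Because every $Q_j$ in {E.1} is diagonal, membership in $F$ and the dual objective can be checked componentwise, in the spirit of Corollary 3.1.1. The sketch below assumes the data of {E.1} as transcribed in its comments (in particular the pairing of the linear terms and constants with $\Phi_1$ and $\Phi_2$); the function names are hypothetical, and this is a hand check rather than the computational procedure of the dissertation.

```python
from fractions import Fraction as Fr

# Assumed data of example {E.1}: diagonals of Q_0, Q_1, Q_2,
# the columns h_0, h_1, h_2 of H, and the constants c_0, c_1, c_2.
Q_DIAGS = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
H_COLS = [(1, -1, -2), (0, 0, 1), (0, 1, 0)]
C = (0, -5, -4)

def q_and_hy(y):
    """Diagonal of Q(y) and the vector Hy."""
    n = len(Q_DIAGS[0])
    q = [sum(yj * d[i] for yj, d in zip(y, Q_DIAGS)) for i in range(n)]
    hy = [sum(yj * h[i] for yj, h in zip(y, H_COLS)) for i in range(n)]
    return q, hy

def in_F(y):
    """y is in F when y_0 = 1, y >= 0, and Hy lies in the column space of Q(y).
    For diagonal Q(y): (Hy)_i must vanish wherever q_ii = 0."""
    if y[0] != 1 or any(yj < 0 for yj in y):
        return False
    q, hy = q_and_hy(y)
    return all(qi != 0 or hi == 0 for qi, hi in zip(q, hy))

def dual_value(y):
    """psi(y) = -1/2 * sum over q_ii != 0 of (Hy)_i^2 / q_ii + c^t y."""
    q, hy = q_and_hy(y)
    quad = sum(Fr(hi) ** 2 / qi for qi, hi in zip(q, hy) if qi != 0)
    return -quad / 2 + sum(cj * yj for cj, yj in zip(C, y))
```

For instance, `in_F((1, 0, 1))` and `in_F((1, 2, 0))` hold while `in_F((1, 0, 2))` does not, matching the two isolated boundary points of $F$, and `dual_value((1, 0, 1))` equals the primal optimum $-13/2$.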

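Another example is as follows:

$$\{E.2\} \quad \begin{aligned}
\text{minimum} \quad & \Phi_0(x) = \tfrac{1}{2} x^{t} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} x + \begin{bmatrix} -2 \\ -2 \\ -2 \end{bmatrix}^{t} x \\
\text{subject to:} \quad & \Phi_1(x) = \tfrac{1}{2} x^{t} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} x + \begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix}^{t} x - 1 \le 0 \\
& \Phi_2(x) = \tfrac{1}{2} x^{t} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} x + \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix}^{t} x - 1 \le 0.
\end{aligned}$$

The primal solution is $x^{\circ} = (1,1,1)^{t}$ and $\Phi_0(x^{\circ}) = -5$. The dual has

$$Q = \begin{bmatrix} 1+y_1+y_2 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1+y_1+y_2 \end{bmatrix}, \qquad
Hy = \begin{bmatrix} -2 - y_1 \\ -2 + y_1 + y_2 \\ -2 - y_2 \end{bmatrix},$$

$$(I - Q^{-}Q) = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$

for all $y_1, y_2 \ge 0$, and $F = F^{*}$. An index maximal dual feasible vector
will have $y_1, y_2 > 0$, and

$$(I - Q^{-}Q)Hy = \begin{bmatrix} 0 \\ -2 + y_1 + y_2 \\ 0 \end{bmatrix},$$

$$F = F^{*} = \{\, y_0, y_1, y_2 \ge 0 \mid y_0 = 1,\ y_1 + y_2 = 2 \,\}.$$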
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Before  stating  the  next  three  corollaries  it  will  be  useful  to 
state two definitions dealing with the topology of convex sets. These
concepts and their properties are developed by Fenchel [27] and Rockafellar [58].

Definition. The affine hull of $C \subset E^n$, aff C, is the intersection
of the collection of affine sets M such that $C \subset M$.

Definition. The relative interior of a convex set C, ri C,
consists of the points $x \in \mathrm{aff}\,C$ for which there exists an $\epsilon > 0$
such that $y \in C$ whenever $y \in \mathrm{aff}\,C$ and $\|x - y\| \le \epsilon$. In other words,

\[
\mathrm{ri}\,C = \{ x \in \mathrm{aff}\,C \mid \exists\, \epsilon > 0 \text{ such that } (x + \epsilon B) \cap \mathrm{aff}\,C \subset C \}
\]

where B is the unit ball in $E^n$, $B = \{ x \mid \|x\| \le 1 \}$, and

\[
x + \epsilon B = \{ x' \mid \|x - x'\| \le \epsilon \}.
\]

It is evident that for a convex set C

\[
\mathrm{ri}\,C \subset C \subset \mathrm{cl}\,C,
\]

where cl C is the closure of C.
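As a small illustration of these two definitions (an example chosen here for concreteness, not taken from the text), consider a half-open segment lying on the first coordinate axis of $E^2$:

\[
C = \{ (x_1, 0) \in E^2 \mid 0 \le x_1 < 1 \}.
\]

Its affine hull is the whole axis, and the relative interior is taken with respect to that axis rather than the plane:

\[
\mathrm{aff}\,C = \{ (x_1, 0) \mid x_1 \in E^1 \}, \qquad
\mathrm{ri}\,C = \{ (x_1, 0) \mid 0 < x_1 < 1 \}, \qquad
\mathrm{cl}\,C = \{ (x_1, 0) \mid 0 \le x_1 \le 1 \}.
\]

The ordinary interior of C in $E^2$ is empty, yet ri C is not, and the chain $\mathrm{ri}\,C \subset C \subset \mathrm{cl}\,C$ holds with both inclusions strict.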

The following corollary of Rockafellar [58, Corollary 6.3.1],
relating the closure of two sets given a hypothesis on the relative
interiors, is stated without proof as a lemma.

Lemma 3.1. Let $C_1$ and $C_2$ be convex sets in $E^n$. Then $\mathrm{cl}\,C_1 = \mathrm{cl}\,C_2$
if and only if $\mathrm{ri}\,C_1 = \mathrm{ri}\,C_2$.

The  following  is  an  important  consequence  of  Theorem  3.5. 

Lemma 3.2. aff F = aff F*.

Proof of Lemma 3.2. Clearly, if $F = F^*$ the lemma is trivial.
Therefore, assume that $F \ne F^*$. Now $F \subset F^*$; therefore, $\mathrm{aff}\,F \subset \mathrm{aff}\,F^*$.
Let $y^* \in F \cap F^*$ be index maximal and $y \in F^*$ but $y \notin F$. $F^*$ is convex;
hence, for $0 < \lambda' < 1$,

\[
y' = \lambda' y^* + (1 - \lambda') y \in F^*.
\]

From this,

\[
N(y') \supseteq N(y^*) \supseteq N(y'),
\]

thus,

$N(y') = N(y^*)$ by $y^*$ being index maximal and

$Q^-(y^*)Q(y^*) = Q^-(y')Q(y')$ by Corollary 3.3.1.

Then $y' \in F$ and the line passing through $y^*$, $y'$, $y$ is in the affine hull
of F, i.e.,

\[
\mathrm{aff}\,F^* \subset \mathrm{aff}\,F,
\]

which combined with $\mathrm{aff}\,F \subset \mathrm{aff}\,F^*$ results in aff F = aff F*.

Q.E.D.
The next theorem and associated corollaries establish the importance
of the concept of an index maximal dual feasible vector.

Theorem 3.6. ri F = ri F*.

Proof of Theorem 3.6. By Lemma 3.2, aff F = aff F*. If $F = F^*$
the result is trivial; hence, assume that $F \subset F^* = \mathrm{cl}\,F^*$. Also, it is
clear that if $y \in \mathrm{ri}\,F$ then $y \in \mathrm{ri}\,F^*$.

Now take $y \in \mathrm{ri}\,F^*$. There exist $y', y^* \in F^*$ such that $y^*$ is index
maximal and, because $y \in \mathrm{ri}\,F^*$,

\[
y = \lambda y^* + (1 - \lambda) y'
\]

for some $0 < \lambda < 1$. From this it is seen that y is also index maximal
and hence in F, by application of Corollary 3.3.1.

Select $y'' \in (y + \epsilon B) \cap \mathrm{aff}\,F^*$, that is, select $y''$ in the nonempty
$\epsilon$-neighborhood of y such that $\|y - y''\| \le \delta = \tfrac{1}{2}\epsilon$. Then there exists $y'''$
in the $\epsilon$-neighborhood such that

\[
y'' = \lambda'' y + (1 - \lambda'') y'''
\]

for some $0 < \lambda'' < 1$. Hence, $y''$ is an index maximal dual feasible
vector in F. Thus, $(y + \tfrac{1}{2}\epsilon B) \cap \mathrm{aff}\,F \subset F$, implying $y \in \mathrm{ri}\,F$ and
ri F = ri F*.

Q.E.D.

Corollary 3.6.1. cl(F) = F*.

Proof of Corollary 3.6.1. By definition F* is closed. Using
Theorem 3.6 and Lemma 3.1 the result is immediate.

Q.E.D.

Corollary 3.6.2. $y \in F$ $(F^*)$ is an index maximal dual feasible
vector if and only if $y \in \mathrm{ri}\,F$ $(\mathrm{ri}\,F^*)$.

Proof of Corollary 3.6.2. Assume F consists of more than one
point, otherwise the corollary is immediate.

Take $y \in \mathrm{ri}\,F$; then there exist points $y^*, y' \in F^*$ such that $y^*$ is
index maximal and $y = \lambda y^* + (1 - \lambda) y'$ for some $0 < \lambda < 1$. Hence,
$N(y) = N(y^*)$ and y is index maximal.

Now take y index maximal in F* and let $\xi = \min_{j \in N(y)} \{ y_j \}$. Then
any $y' \in (y + \tfrac{1}{2}\xi B) \cap \mathrm{aff}\,F^*$ is in F*. Therefore, $y \in \mathrm{ri}\,F^*$ $(\mathrm{ri}\,F)$.

Q.E.D.
The implications of the preceding corollary are significant.
First, there can be no dual of {P.1} which exhibits a closed F properly
contained in the interior of $E_+^{m+1}$. Furthermore, relative boundary
points of F are always on at least one of the coordinate hyperplanes of
$E^{m+1}$. F is assumed to consist of more than one point.

Figure 3-1 consists of three convex sets, m = 2, which are not
permissible as F; while Figure 3-2 consists of three examples of F,
m = 2 again.


[Figure 3-1: panels (a), (b), (c); axes $y_1$, $y_2$; $y_0 = 1$, not shown]

[Figure 3-2: panels (a), (b), (c); axes $y_1$, $y_2$; $y_0 = 1$, not shown]

3.4 Objective Function $\psi(y)$

Having seen that F can be characterized by F*, i.e., ri F = ri F*
and cl F = F*, the next step is to investigate $\psi(y)$ on F*. This is done
by initially looking at the vector valued function f(y) over F,

\[
f(y) = Q^-(y)Hy.
\]

It will be shown that $\psi(y)$ is concave, continuous, and twice differentiable
on F. Also, $\psi(y)$ approaches negative infinity as a relative
boundary point of F, not in F, is approached.

The definition of $\psi(y)$ will be extended to F* and $\psi(y)$ will be
shown to be upper semicontinuous on F*.

Lemma  3,3.  For  y*  e  ri  F*,  y  e  F*,  and  0  <  X  <  1 
f  (X)  =  f (Xy*  +  (l-X)y) 

=  [XQ(y*)  +  (l-X)Q(y)]"H(Xy*  +  (l-X)y) 
f(X)  =  (R^R)"Hy* 

+  (I-R"B(y*))Q"(y)(I-R"B(y*))% 

"*"l  1^1  (I-R'^(y*))Q~(y)  (I-R"B(y*))''Hy* 

_[^j(I_R-£(y*))Q-(y)£t(y*)GM(X)GB(y*)Q"(y)(I-R"B(y*))V* 

2 
_  j^j  (i-R-i(y*))Q-(y)B*'(y*)GM(X)GB(y*)Q"(y)(I-R'B(y*))^Hy 

+(^)  (R^R)'Hy 


where 


R  =  B(y*)(I-Q"(y)Q(y)) 
G  =  I-RR~ 

M(X)  =  {I  +  (j^j    GB(y*)Q~(y)B^(y*)Gr 
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Proof  of  Lemma  3.3. 


[XQ(y*)  +  (i-A)Q(y)]"  =  Y[Q(y*)  +  (^)  Q(y)]'- 

1/2 
Letting  U  =  B^(y)/  [^]    ,  V  =  B  (y*)  and  applying  Cline's  result, 

Theorem  A. 14,  the  result  is  obtained. 

Q.E.D. 

The term $M(\lambda)$ can readily be shown to exhibit the property

\[
\lim_{\lambda \downarrow 0} M(\lambda) = I + O(\lambda^2)
\]

where a matrix valued function is $O(\lambda^2)$ if each element is $O(\lambda^2)$. The
definition employed for $O(\lambda^n)$ is the following:

Definition. A scalar function $T(\cdot)$, of a real variable $\lambda$, is said
to be $O(\lambda^n)$ as $\lambda \to 0$ if $T(\lambda)/\lambda^n$ is bounded as $\lambda \to 0$.

The next lemma establishes a useful relationship for some matrix
products inherent in $f(\lambda)$.

Lemma 3.4. $R^{t-}Q^-(y) = 0$ and $R^{t-}(I - Q^-(y)Q(y)) = R^{t-}$.

Proof  of  Lemma  3.4.  Noting  that 

Q"(y)Q(y)R*^RQ(y)  =  Q"(y)Q(y)(i-Q~(y)Q(y))B*^(y*)RQ(y)  =  0, 

R'^RQ(y)  =  R*'B(y*)(I-Q"(y)Q(y))Q(y)  =  0, 
RV"Q(y)Q(y)R''  =  RV'q^y)  (I-Q"(y)Q(y))B*'(y*)  =  0, 
Q(y)Q(y)R*'  =  Q(y)Q(y)(i-Q"(y)Q(y))B^(y*)  =  0, 

the hypothesis of Theorem A.16 is satisfied for $[Q(y)R^t]^-$. Hence,

\[
[Q(y)R^t]^- = R^{t-}Q^-(y),
\]

but $Q(y)R^t = 0$, so that $R^{t-}Q^-(y) = 0$ and

\[
R^{t-} = R^{t-} - R^{t-}Q^-(y)Q(y) = R^{t-}(I - Q^-(y)Q(y)).
\]

Q.E.D. 

There is a direct connection between the matrix R and the set of
dual feasible vectors F. This is exhibited in the following.

Lemma 3.5. If $y^* \in \mathrm{ri}\,F^*$, $y \in F^*$ and $R = B(y^*)(I - Q^-(y)Q(y)) = 0$,
then $y \in F$.

Proof of Lemma 3.5. By definition of F*, $Q(y^*)Q^-(y^*)Hy = Hy$.
Premultiplying by $(I - Q^-(y)Q(y))$ results in

\[
(I - Q^-(y)Q(y))\,Q(y^*)Q^-(y^*)Hy = (I - Q^-(y)Q(y))Hy,
\]

but by Theorem 2.3 $Q(y^*)Q^-(y^*) = B^t(y^*)B^{t-}(y^*)$; therefore, by hypothesis,

\[
(I - Q^-(y)Q(y))Hy = 0 \quad \text{and} \quad y \in F.
\]

Q.E.D.

Theorem 3.7. If $y^* \in \mathrm{ri}\,F^*$, $y \in F^*$, then

(a) $\lim_{\lambda \downarrow 0} \psi(\lambda y^* + (1-\lambda)y) = -\infty$ if $y \notin F$,

(b) $\lim_{\lambda \downarrow 0} \psi(\lambda y^* + (1-\lambda)y) = \psi(y)$ if $y \in F$,

hence $\psi(y)$ is continuous over F.

Proof of Theorem 3.7. First it is noted that

\[
\psi(\lambda y^* + (1-\lambda)y) = -\tfrac{1}{2}\, f^t(\lambda)\,[\lambda Q(y^*) + (1-\lambda)Q(y)]\,f(\lambda) + c^t(\lambda y^* + (1-\lambda)y)
\]

and on substitution of $f(\lambda)$ from Lemma 3.3, it is seen that for:

(a) if F = F* the result is satisfied vacuously; otherwise one
term of $\lambda f^t(\lambda)Q(y^*)f(\lambda)$ is

\[
\lambda \left( \frac{1-\lambda}{\lambda} \right)^2 y^t H^t (R^tR)^- Q(y^*) (R^tR)^- Hy
\]
\[
= \lambda \left( \frac{1-\lambda}{\lambda} \right)^2 y^t H^t R^- R^{t-} (I - Q^-(y)Q(y))\, Q(y^*)\, (I - Q^-(y)Q(y)) R^- R^{t-} Hy
\]
\[
= \lambda \left( \frac{1-\lambda}{\lambda} \right)^2 y^t H^t (R^tR)^- Hy
= \lambda \left( \frac{1-\lambda}{\lambda} \right)^2 [R^{t-}Hy]^t [R^{t-}Hy].
\]

The first equivalence follows from Corollary A.3.2 and Lemma 3.4. By
Lemma 3.5, $R \ne 0$ and $y \notin F$ implies that $R^{t-}Hy \ne 0$. Hence, the term
is positive and as $\lambda$ approaches zero, $\psi$ approaches negative infinity,
all other terms being finite.

(b) $y \in F$ implies $Hy = Q^-(y)Q(y)Hy$, Corollary A.3.2 implies
$(R^tR)^- = R^-R^{t-}$ and using Lemma 3.4 $(R^tR)^-Hy = 0$. The only
term remaining in the limit is

\[
y^t H^t Q^-(y)\,(I - R^-B(y^*))^t\, Q(y)\, (I - R^-B(y^*))\, Q^-(y) Hy
\]

which on application of Lemma 3.4 reduces to $y^t H^t Q^-(y)Q(y)Q^-(y)Hy$
and $\lim_{\lambda \downarrow 0} \psi(\lambda y^* + (1-\lambda)y) = \psi(y)$ for $y \in F$.

The continuity of $\psi(y)$ on F is immediate from part (b).

Q.E.D.
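The two limits of Theorem 3.7 can be observed numerically on example {E.1}. The following is a minimal sketch under the assumption that the {E.1} data are $Q(y) = \mathrm{diag}(1, y_1, y_2)$, $Hy = (1, -1+y_2, -2+y_1)^t$ and $c = (0,-5,-4)^t$; the function names are hypothetical, and the diagonal structure is what allows the generalized inverse to be written entrywise.

```python
def psi(y1, y2):
    """psi(y) = -1/2 (Hy)^t Q^-(y) (Hy) + c^t y for {E.1}, with y_0 = 1."""
    diag = (1.0, y1, y2)
    hy = (1.0, -1.0 + y2, -2.0 + y1)
    # Moore-Penrose inverse of a diagonal matrix: invert nonzero entries only.
    quad = sum(h * h / d for h, d in zip(hy, diag) if d != 0.0)
    return -0.5 * quad - 5.0 * y1 - 4.0 * y2


def psi_on_segment(lam, y_star, y_end):
    """psi evaluated at lam * y_star + (1 - lam) * y_end."""
    y1 = lam * y_star[0] + (1.0 - lam) * y_end[0]
    y2 = lam * y_star[1] + (1.0 - lam) * y_end[1]
    return psi(y1, y2)


y_star = (1.0, 1.0)      # (y1, y2) of the index maximal vector (1, 1, 1)
outside_F = (0.0, 2.0)   # y = (1, 0, 2) lies in F* but not in F
inside_F = (0.0, 1.0)    # y = (1, 0, 1) lies in F

for lam in (1e-1, 1e-3, 1e-5):
    print(lam,
          psi_on_segment(lam, y_star, outside_F),
          psi_on_segment(lam, y_star, inside_F))
```

Along the segment ending at $(1,0,2)$ the values decrease without bound as $\lambda$ shrinks, as in part (a), while along the segment ending at $(1,0,1)$ they approach $\psi(1,0,1) = -6.5$, as in part (b).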

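The directional derivative of $\psi(y)$ also exhibits a useful property
as relative boundary points are approached.

Corollary 3.7.1. If $y^* \in \mathrm{ri}\,F$, $y \in F^*$, then

\[
\lim_{\lambda \downarrow 0} \frac{d}{d\lambda}\, \psi(\lambda y^* + (1-\lambda)y) = +\infty \quad \text{for } y \notin F.
\]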

Proof  of  Corollary  3.7.1. 

,j;(Xy*  +  (l-X)y)  =  -  |  f^(X)[XQ(y*)  +  (l-X)Q(y)  ]f  (X) 

+  c*^(Xy*  +  (l-X)y) 
where  f(X)  is  as  developed  in  Lemma  3.3.   From  this,  it  can  be  shown  that, 

lim  4t  ipCXy*  +  (l-X)y)  =  K(y*,y)  +  lim  j  l^l  yV(R^R)"Hy, 
X+0  '^^  X+0  ^  ^  ^  ' 

y*^H'^(R*^R)~Hy  >  0  by  hypothesis  with 

K(y*,y)  =  -  ■i-y*tHt(RtR)-H(y*-2y) 

-y*V(R*^R)"Q(y*)D^Hy 

-|yVDf(Q(y*)-Q(y))D^Hy 
+  (y-y*)VD^Q(y*)(R^R)"Hy 
+  y*'^HVQ(y*)(R*^R)"Hy 
-  y*'H^DjQ(y)D^Hy* 
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+  y^H*'DjQ(y)D2Hy* 

+  c  (y*-y) 
where    D^  =  (I-R"B(y*))Q"(y) (I-B^(y*)R^") 

D^  =  (I-R"B(y*))Q'(y)Q(y*)D^. 
K(y*»y)  is  constant. 

Thus it is seen that

\[
\lim_{\lambda \downarrow 0} \frac{d}{d\lambda}\, \psi(\lambda y^* + (1-\lambda)y) = +\infty \quad \text{for } y \notin F.
\]

Q.E.D.


Up to this point $\psi(y)$ has only been defined on F and not on F*.
Extending the domain of $\psi(y)$ to F* can be accomplished by use of
Theorem A.13. That is, for $y \in F^*$, $y \notin F$, accept the best approximate
solution to the system $Q(y)x + Hy = 0$, namely, $x = -Q^-(y)Hy$. Then
$\psi(y)$ has the same form as on F and is finite for such boundary points.
Hereafter, $\psi(y)$ will be used to represent the function on both F and F*.

In order to characterize $\psi(y)$ on F* the definition for an upper
semicontinuous function is stated for emphasis.

Definition. A real valued function f defined on a normed space $E^n$
is said to be upper semicontinuous at $x_0$ if, given $\epsilon > 0$, there is a
$\delta > 0$ such that $f(x) - f(x_0) < \epsilon$ for $\|x - x_0\| < \delta$.

From the continuity of $\psi(y)$ on F, the definition of $\psi(y)$ over F*,
and Theorem 3.7 the following is immediate.

Corollary 3.7.2. $\psi(y)$ is an upper semicontinuous function on F*.

The form of the gradient and the Hessian of $\psi(y)$ will now be
developed. To simplify the proof of the continuity of the gradient
and the Hessian, the next lemma is developed.

Stewart [66] has shown that a necessary and sufficient condition
for a generalized inverse to be continuous is that its rank be constant
over its domain. This is an extension of the work of Ben-Israel [4]
who first addressed the question.

The relative interior of F is a set of index maximal dual feasible
vectors, Corollary 3.6.2, which by Corollary 3.3.1 implies that the
rank of Q is constant over ri F. Therefore, applying Stewart's result
the lemma is stated as:

Lemma 3.6. $f(y) = Q^-(y)Hy$ is continuous over the relative interior
of F.

Theorem 3.8. The gradient of $\psi(y)$, $y \in F$, is

\[
\nabla\psi(y) = [\Phi_1(-Q^-(y)Hy),\ \Phi_2(-Q^-(y)Hy),\ \ldots,\ \Phi_m(-Q^-(y)Hy)]^t
\]

and is continuous on ri F.

Proof  of  Theorem  3.8.  The  k   component  of  7\()(y)  is 

which  by  Theorem  2.4,  for  k  e  W(y),  reduces  to 

1'^^=  *.(-Q"(y)Hy). 
For  k  i   W(y),  modification  of  the  proof  of  Theorem  2.4  gives 


and 


Then 


1^  =  lim  (QQ  +  62l)  (I-Q"Q)-Q'Q,  Q' 
^^k   6+0  ^ 


-|2l  Hy  =  if  Q'QHy  =  -Q"a  Q"Hy. 

ayfc    ay^        ^ 


^iSll  =  *  (-Q~(y)Hy)  for  k  i.   W(y)  and  the  form  of  the  gradient 
3y,      k 

is  defined. 

The  continuity  of  7iJ^(y)  follows  from  Lemma  3.6,  noting  that 
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■^  Q.E.D. 

This is an extension of the result of Falk [26] where it is seen
that $\nabla\psi(y)$ is defined at a particular point, i.e., $x = -Q^-(y)Hy$, instead
of any $x = -Q^-(y)Hy - (I - Q^-(y)Q(y))g$.
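The statement that each partial derivative of $\psi$ equals the corresponding primal constraint function evaluated at $x = -Q^-(y)Hy$ can be checked by finite differences. The sketch below does this for example {E.1} at interior points ($y_1, y_2 > 0$, so $Q(y)$ is nonsingular); the data are assumed to be those of {E.1}, and the helper names and the step size are illustrative choices only.

```python
def x_of_y(y1, y2):
    """x(y) = -Q^-(y) Hy for {E.1}: Q(y) = diag(1, y1, y2), y1, y2 > 0."""
    hy = (1.0, -1.0 + y2, -2.0 + y1)
    return (-hy[0], -hy[1] / y1, -hy[2] / y2)


def phi1(x):
    """Phi_1(x) = 1/2 x_2^2 + x_3 - 5."""
    return 0.5 * x[1] ** 2 + x[2] - 5.0


def phi2(x):
    """Phi_2(x) = 1/2 x_3^2 + x_2 - 4."""
    return 0.5 * x[2] ** 2 + x[1] - 4.0


def psi(y1, y2):
    """psi(y) = -1/2 (Hy)^t Q^{-1}(y) (Hy) + c^t y, with y_0 = 1."""
    hy = (1.0, -1.0 + y2, -2.0 + y1)
    quad = hy[0] ** 2 + hy[1] ** 2 / y1 + hy[2] ** 2 / y2
    return -0.5 * quad - 5.0 * y1 - 4.0 * y2


def numerical_gradient(y1, y2, step=1e-6):
    """Central-difference approximation of (d psi/d y1, d psi/d y2)."""
    d1 = (psi(y1 + step, y2) - psi(y1 - step, y2)) / (2.0 * step)
    d2 = (psi(y1, y2 + step) - psi(y1, y2 - step)) / (2.0 * step)
    return d1, d2


for point in [(1.0, 1.0), (0.5, 2.0)]:
    x = x_of_y(*point)
    print(point, numerical_gradient(*point), (phi1(x), phi2(x)))
```

At $y = (1,1,1)^t$ both computations give approximately $(-4, -3.5)$, consistent with Theorem 3.8 under the stated assumptions.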

Corollary 3.8.1. The jk-th element of the Hessian of $\psi(y)$, for
$y \in \mathrm{ri}\,F$, is

\[
\frac{\partial^2 \psi(y)}{\partial y_j\, \partial y_k}
= -y^t H^t Q^- Q_j Q^- Q_k Q^- Hy + h_j^t Q^- Q_k Q^- Hy + h_k^t Q^- Q_j Q^- Hy - h_j^t Q^- h_k.
\]

The Hessian exists for each $y \in \mathrm{ri}\,F$ and is continuous on ri F.

Proof of Corollary 3.8.1. The proof follows from
Theorem 2.5 and Theorem 3.8. The continuity of the Hessian follows from
the generalized inverse being continuous on ri F and Lemma 3.6.

Q.E.D.

The next corollary will be used in the proof of the concavity of
$\psi(y)$ on F.

Corollary 3.8.2.

\[
\nabla\psi(y)\,y = \psi(y) - \Phi_0(-Q^-Hy)
\]

Proof of Corollary 3.8.2.

\[
\nabla\psi(y)\,y = \sum_{j=1}^{m} \left[ \tfrac{1}{2}\, y^t H^t Q^- Q_j Q^- Hy - h_j^t Q^- Hy + c_j \right] y_j
\]
\[
= \tfrac{1}{2}\, y^t H^t Q^- Hy - \tfrac{1}{2}\, y^t H^t Q^- Q_0 Q^- Hy - y^t H^t Q^- Hy + h_0^t Q^- Hy + c^t y - c_0
\]
\[
= \psi(y) - \Phi_0(-Q^-Hy)
\]

Q.E.D.

$\psi(y)$ can now be proved to be concave based on the following result
found in Mangasarian [43].

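$\psi(y)$ is concave on F if and only if

\[
\psi(y^2) - \psi(y^1) \le \nabla\psi(y^1)(y^2 - y^1) \quad \text{for each } y^1, y^2 \in F.
\]

Theorem 3.9. $\psi(y)$ is concave on F.

Proof of Theorem 3.9. Take $y^1, y^2 \in F$; then $Q(y^2) = \sum_{j \in N(y^2)} y^2_j Q_j$
is positive semidefinite and

\[
0 \le \tfrac{1}{2} \left[ (y^2)^t H^t Q^-(y^2) - (y^1)^t H^t Q^-(y^1) \right] Q(y^2) \left[ Q^-(y^2)Hy^2 - Q^-(y^1)Hy^1 \right]
\]
\[
0 \le \tfrac{1}{2} (y^2)^t H^t Q^-(y^2) Hy^2 + \tfrac{1}{2} (y^1)^t H^t Q^-(y^1) Q(y^2) Q^-(y^1) Hy^1
\]
\[
\qquad - \tfrac{1}{2} (y^2)^t H^t Q^-(y^2) Q(y^2) Q^-(y^1) Hy^1 - \tfrac{1}{2} (y^1)^t H^t Q^-(y^1) Q(y^2) Q^-(y^2) Hy^2 .
\]

Now $y^2 \in F$ implies that $Q(y^2)Q^-(y^2)Hy^2 = Hy^2$. This, and adding
$c^t y^2$ to the inequality:

\[
\psi(y^2) \le \tfrac{1}{2} (y^1)^t H^t Q^-(y^1) Q(y^2) Q^-(y^1) Hy^1 - (y^2)^t H^t Q^-(y^1) Hy^1 + c^t y^2
\]
\[
\psi(y^2) \le \sum_{j=0}^{m} \left[ \tfrac{1}{2} (y^1)^t H^t Q^-(y^1) Q_j Q^-(y^1) Hy^1 - h_j^t Q^-(y^1) Hy^1 + c_j \right] y^2_j .
\]

But this is recognized by Corollary 3.8.2 to be

\[
\psi(y^2) \le \nabla\psi(y^1)\, y^2 + \Phi_0(-Q^-(y^1)Hy^1);
\]

adding $-\psi(y^1)$ and using Corollary 3.8.2 again, the inequality reduces to

\[
\psi(y^2) - \psi(y^1) \le \nabla\psi(y^1)(y^2 - y^1)
\]

and $\psi(y)$ is concave on F.

Q.E.D.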

3.5  Optimal  Primal  Solutions 

Wolfe duality theory applied to {D.1} ensures that the solution of
the dual problem characterizes the primal solution of {P.1}. This is
certainly true in the sense that the optimal objective function value of
{P.1} is determined from the optimal dual solution, yet not all optimal
primal variables are explicitly defined. This is due to the primal
solution being defined in two parts, explicitly on the range space and
implicitly on the null space of Q. Only that part defined explicitly
on the range space of Q is immediately determined from the optimal dual
solution. This should not be construed to imply that an optimal dual
solution does not return the optimal primal vector; it does.

Assume {D.1} has been solved and the optimal dual vector is $y^o$.
Then $y^o \in F$ and the optimal primal vector is given by

\[
x^o = -Q^-(y^o)Hy^o - (I - Q^-(y^o)Q(y^o))g
\]

for $g \in E^n$. The optimal primal vector will be unique only if $Q(y^o)$ has
rank n, i.e., $I = Q^-(y^o)Q(y^o)$. Therefore, to establish $x^o$, g must be
determined.

Substituting $x^o$ into {P.1} results in the following:

{P°.1}  minimize

\[
\Phi_0(g) = -h_0^t (I - Q^-Q) g + \Phi_0(-Q^-Hy^o)
\]

subject to:

\[
\Phi_j(g) = -h_j^t (I - Q^-Q) g + \Phi_j(-Q^-Hy^o) = 0 \quad \text{for } j \in N(y^o),
\]
\[
\Phi_j(g) = \tfrac{1}{2}\, g^t (I - Q^-Q) Q_j (I - Q^-Q) g - (h_j^t - y^{ot} H^t Q^- Q_j)(I - Q^-Q) g + \Phi_j(-Q^-Hy^o) \le 0 \quad \text{for } j \notin N(y^o),
\]

where Q is understood to be a function of $y^o$ and the order of the constraints
has been arranged so that the first $S(y^o)-1$ constraints are
active, that is, $N(y^o) - \{0\} = \{1, 2, \ldots, S(y^o)-1\}$.

By complementary slackness, $\Phi_j(g)$ for $j \in N(y^o) - \{0\}$ is zero.
This results in a consistent set of linear equalities.

Let $H_o = (h_1, h_2, \ldots, h_{S(y^o)-1})$, an $n \times (S(y^o)-1)$ matrix, and $\Phi^o$ be
the $S(y^o)-1$ vector

\[
\Phi^o = [\Phi_1(-Q^-Hy^o),\ \Phi_2(-Q^-Hy^o),\ \ldots,\ \Phi_{S(y^o)-1}(-Q^-Hy^o)]^t .
\]

The first $S(y^o)-1$ constraints of {P°.1} are expressible as

\[
H_o^t (I - Q^-(y^o)Q(y^o)) g = \Phi^o .
\]

Solving for $(I - Q^-Q)g$ results in

\[
(I - Q^-Q) g = H_o^{t-} \Phi^o + (I - H_o^{t-} H_o^t) g' \qquad (3.5.1)
\]

for $g' \in E^n$.
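As a worked illustration (constructed here from the data of example {E.1}, so the specific numbers below are not part of the original development), take the optimal dual vector $y^o = (1,0,1)^t$. Then $N(y^o) = \{0, 2\}$, so the single active constraint is $\Phi_2$; in the ordering convention above it would be listed first. With

\[
Q(y^o) = \mathrm{diag}(1,0,1), \qquad Hy^o = (1, 0, -2)^t, \qquad -Q^-(y^o)Hy^o = (-1, 0, 2)^t, \qquad I - Q^-Q = \mathrm{diag}(0,1,0),
\]

the active constraint gives $H_o = h_2 = (0,1,0)^t$ and

\[
\Phi^o = \Phi_2(-1, 0, 2) = \tfrac{1}{2}(2)^2 + 0 - 4 = -2, \qquad H_o^t (I - Q^-Q) g = g_2 = -2 .
\]

Hence $(I - Q^-Q)g = (0, -2, 0)^t$ and

\[
x^o = (-1, 0, 2)^t - (0, -2, 0)^t = (-1, 2, 2)^t ,
\]

which agrees with the primal solution quoted for {E.1}. The range space part of $x^o$ comes directly from $y^o$, and the null space part comes from the active constraint.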


Theorem 3.10. $\Phi_0(g)$ is a constant for $y^o$. That is, the primal
objective function value is fully determined by the optimal dual vector.

Proof of Theorem 3.10.

$\Phi_0(g) = -h_0^t (I - Q^-Q) g + \Phi_0(-Q^-Hy^o)$ and by the constraint set of
{D.1}

\[
-h_0^t (I - Q^-Q) = [y_1^o, y_2^o, \ldots, y_m^o]\,[h_1, h_2, \ldots, h_m]^t (I - Q^-Q)
\]

which can be written as

\[
-h_0^t (I - Q^-Q) = \bar{y}^{ot} H_o^t (I - Q^-Q)
\]

where $\bar{y}^o = (y_1^o, y_2^o, \ldots, y_{S(y^o)-1}^o)^t$ by noting $y_k^o = 0$ for $k \ge S(y^o)$.

Now,

\[
\Phi_0(g) = \bar{y}^{ot} H_o^t (I - Q^-Q) g + \Phi_0(-Q^-Hy^o)
\]

and on substitution for $(I - Q^-Q)g$ by (3.5.1)

\[
\Phi_0(x^o) = \Phi_0(g) = \bar{y}^{ot} H_o^t H_o^{t-} \Phi^o + \Phi_0(-Q^-(y^o)Hy^o).
\]

Q.E.D.

{P°.1} is now characterized as the minimization of a constant
subject to $m - S(y^o) + 1$ quadratic constraints. The following theorem
defines those instances when it is unnecessary to solve {P°.1}.

Theorem 3.11. The optimal primal solution is fully determined by
the optimal dual solution provided the $n \times (S(y^o)-1)$ matrix $H_o$ has rank n
and the optimal primal solution is

\[
x^o = -Q^-(y^o)Hy^o - (H_o H_o^t)^{-1} H_o \Phi^o .
\]

Proof of Theorem 3.11. If $H_o$ has rank n, then by Theorem A.6,
$H_o^{t-} = (H_o H_o^t)^- H_o$ and $(I - H_o^{t-} H_o^t) = (I - (H_o H_o^t)^- H_o H_o^t) = 0$. Therefore,
$(I - Q^-Q)g = H_o^{t-}\Phi^o = (H_o H_o^t)^- H_o \Phi^o$ and $x^o = -Q^-(y^o)Hy^o - (H_o H_o^t)^- H_o \Phi^o$ for
$H_o = (h_1, h_2, \ldots, h_{S(y^o)-1})$, $(H_o H_o^t)^- = (H_o H_o^t)^{-1}$.

Q.E.D.


When $H_o$ does not have rank n, {P°.1} is a special form of {P.1}
which can readily be solved. Substitution by (3.5.1) in {P°.1},
noting that it is only required to determine a $g'$ which satisfies the
$m - S(y^o) + 1$ constraints, and solving the following strictly convex auxiliary
primal problem defines the component of $x^o$ which is in the null space of
$Q(y^o)$.

{P°.2}  minimize  $\tfrac{1}{2}\, g'^t I g'$

subject to:  $\Phi_j(g') \le 0$,  for $j \notin N(y^o)$

where

\[
\Phi_j(g') = \tfrac{1}{2}\, g'^t (I - H_o^{t-} H_o^t) Q_j (I - H_o^{t-} H_o^t) g'
\]
\[
\qquad + [\Phi^{ot} H_o^- Q_j - h_j^t + y^{ot} H^t Q^- Q_j](I - H_o^{t-} H_o^t) g'
\]
\[
\qquad + [\tfrac{1}{2} \Phi^{ot} H_o^- Q_j - h_j^t + y^{ot} H^t Q^- Q_j] H_o^{t-} \Phi^o
\]
\[
\qquad + \Phi_j(-Q^-(y^o)Hy^o).
\]

Solving {P°.2} by the dual {D.1} will return a unique $g'$; Q will be
of rank n. Therefore, at most, two dual problems will require solution
and at least one will be a strictly convex problem permitting the use
of Corollary 3.1.2.

3.6  Linear  Constraints

The class of convex quadratically constrained quadratic programs
{P.1} can be expanded to admit problems with linear constraints.

{P.1'}  minimize

\[
\Phi_0(x) = \tfrac{1}{2}\, x^t Q_0 x + h_0^t x + c_0
\]

subject to:

\[
\Phi_j(x) = \tfrac{1}{2}\, x^t Q_j x + h_j^t x + c_j \le 0 \quad \text{for } j = 1, 2, \ldots, m_1,
\]
\[
h_k^t x + c_k \le 0 \quad \text{for } k = m_1 + 1, m_1 + 2, \ldots, m,
\]

where $m - m_1 = m_2$.

By treating linear inequality constraints as quadratic constraints, with
zero quadratic terms, {P.1'} is merely a special form of {P.1} and the
theory of Section 3.2 applies.

The more interesting case is when {P.1'} has linear equality constraints.
Let the set of linear equality constraints be given by

\[
Ax = c \qquad (3.6.1)
\]

where A is an $m_2 \times n$ coefficient matrix and c is an $m_2 \times 1$ constant vector.

Given that the primal problem is feasible there exist solutions
to (3.6.1). Using the generalized inverse and the assumption of
feasibility,

\[
x = A^- c + (I - A^- A) g \qquad (3.6.2)
\]

for all $g \in E^n$. Clearly any solution to {P.1'} is of the form (3.6.2)
and the primal problem can be transformed to one in g.
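The parameterization (3.6.2) is easy to check in the simplest case of a single equality constraint, where A is one nonzero row $a^t$ and its Moore-Penrose inverse is $a/(a^t a)$. The sketch below is restricted to that special case as an assumption made for brevity; the function name is hypothetical.

```python
def solve_single_equality(a, c, g):
    """Return x = A^- c + (I - A^- A) g for a single-row A = a^t, a != 0."""
    norm_sq = sum(ai * ai for ai in a)
    # For a nonzero row vector a^t, the Moore-Penrose inverse is a / (a^t a).
    a_pinv = [ai / norm_sq for ai in a]
    a_dot_g = sum(ai * gi for ai, gi in zip(a, g))
    # (I - A^- A) g = g - A^- (A g)
    return [p * c + gi - p * a_dot_g for p, gi in zip(a_pinv, g)]


a = (1.0, 2.0, 2.0)
c = 3.0
for g in [(0.0, 0.0, 0.0), (1.0, -4.0, 7.0), (10.0, 0.5, -3.0)]:
    x = solve_single_equality(a, c, g)
    print(g, x, sum(ai * xi for ai, xi in zip(a, x)))
```

Every choice of g yields a point satisfying $a^t x = c$, which is the property that allows the equality constraints to be eliminated in favor of the free vector g.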

Substituting (3.6.2) into {P.1'} results in the following

{P.2'}  minimize

\[
\Phi_0'(g) = \tfrac{1}{2}\, g^t (I - A^-A) Q_0 (I - A^-A) g + (h_0^t + c^t A^{-t} Q_0)(I - A^-A) g
+ (c_0 + \tfrac{1}{2}\, c^t A^{-t} Q_0 A^- c + h_0^t A^- c)
\]

subject to:

\[
\Phi_j'(g) = \tfrac{1}{2}\, g^t (I - A^-A) Q_j (I - A^-A) g + (h_j^t + c^t A^{-t} Q_j)(I - A^-A) g
+ (c_j + \tfrac{1}{2}\, c^t A^{-t} Q_j A^- c + h_j^t A^- c) \le 0
\quad \text{for } j = 1, 2, \ldots, m_1,
\]

where $A^{-t}$ denotes the transpose of $A^-$. This transforms {P.1'} with linear equality constraints to the form {P.1}.

The advantage of the conversion is that the dual problem {D.1} is
reduced from a problem in m dual variables to one in only $m_1$ dual
variables.

It  is  possible  to  express  a  set  of  linear  inequality  constraints  as 
a  set  of  equality  constraints  by  the  use  of  slack  variables.  This  would 
be  highly  desirable  based  on  the  potential  reduction  of  the  size  of  the 
dual  problem.  Unfortunately,  the  use  of  slack  variables  will  also 
increase  the  size  of  the  set  of  nonnegativity  constraints  and  the  dual 
problem  size  remains  effectively  unchanged. 

In  conclusion,  it  can  be  said  that  linear  constraints  cause  no 
difficulty  in  applying  the  dual  {D.l}.  While  linear  equality  constraints 
permit  the  formation  of  an  equivalent  reduced  problem,  linear  inequality 
constraints  are  treated  as  special  quadratic  form  constraints. 

3.7  Linear  and  Quadratic  Programming 

Linear  and  convex  quadratic  programs  are  proper  subsets  of  the 
class  {P.l}.  The  dual  {D.l},  therefore,  should  admit  of  special  forms 
for  these  two  subclasses. 

For linear programs, {D.1} results in the unsymmetric dual. This
is seen by noting that Q = 0 and the constraint set F is given by

\[
F = \{ y_j \ge 0,\ j = 1, 2, \ldots, m \mid \sum_{j=1}^{m} h_j y_j = -h_0 \},
\]

the objective function is

\[
\psi(y) = c_0 + \sum_{j=1}^{m} c_j y_j .
\]
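A one-variable illustration (chosen here for concreteness, not taken from the text) shows the form of this dual. Consider the linear program

\[
\text{minimize } \Phi_0(x) = x \quad \text{subject to } \Phi_1(x) = -x + 1 \le 0,
\]

so that $h_0 = 1$, $c_0 = 0$, $h_1 = -1$, $c_1 = 1$. The dual constraint set and objective are

\[
F = \{ y_1 \ge 0 \mid -y_1 = -1 \} = \{ 1 \}, \qquad \psi(y) = c_0 + c_1 y_1 = 1,
\]

which equals the primal optimum attained at $x = 1$.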

The application of {D.1} to convex quadratic programs results in a
nonsymmetric dual which is again a quadratic program.
Define the general convex quadratic program as

{Q.1}  minimize

\[
\Phi_0(x) = \tfrac{1}{2}\, x^t Q_0 x + h_0^t x + c_0
\]

subject to:

\[
\Phi_j(x) = h_j^t x + c_j \le 0 \quad \text{for } j = 1, 2, \ldots, m
\]

where $Q_0$ is symmetric and positive semidefinite. {D.1} reduces to

{D.4}  maximize

\[
\psi(y) = -\tfrac{1}{2}\, y^t H^t Q_0^- H y + (c^t - h_0^t Q_0^- H) y + (-\tfrac{1}{2}\, h_0^t Q_0^- h_0 + c_0)
\]

subject to:

\[
y = (y_1, y_2, \ldots, y_m)^t \ge 0,
\]
\[
(I - Q_0^- Q_0) H y = -(I - Q_0^- Q_0) h_0
\]

where $H = (h_1, h_2, \ldots, h_m)$ and $c = (c_1, c_2, \ldots, c_m)^t$.

{D.4} reflects a notational variation from {D.1} in that $Q = Q_0$
and is not a function of the $y_j$. This permits the implicit representation
of $y_0 = 1$ in {D.4}. It is clear that $\psi(y)$ is quadratic and that the
constraint set is linear. For $Q_0$ of rank r, there are n-r linear
constraints in {D.4}. This is seen by noting that there exists an
orthogonal matrix P such that

\[
P^t Q_0 P = D_0 .
\]

$D_0$ is a diagonal matrix which has r nonzero diagonal elements equal to
the eigenvalues of $Q_0$. The constraint set is thereby reduced to

\[
(I - D_0^- D_0) P^t H y = -(I - D_0^- D_0) P^t h_0 .
\]

$(I - D_0^- D_0)$ is a diagonal matrix with r zero and n-r nonzero diagonal
elements. Hence, there will be n-r linear equality constraints for {D.4}.

If $Q_0$ is nonsingular, {D.4} has only nonnegativity constraints and
is the dual form addressed by Lemke [41].

3.8  Equivalence  of  Dual  Forms 

In Sections 3.2 through 3.6 the generalized inverse dual {D.1}
and its properties were developed for convex quadratically constrained
quadratic programs. {D.1} was developed from the Wolfe dual, {W.1}, of
{P.1}. Three of the significant advantages of {D.1} over {W.1} are
(1) {D.1} is an m variable dual while {W.1} has m+n variables, (2) the
objective function of {D.1} is concave whereas that of {W.1} is not, and
(3) the constraint set of {D.1} is also characterized by a linear system.

There are two other dual forms of {P.1}: the generalized
geometric inequality dual of Peterson and Ecker [51,52,53] and the
conjugate function dual of Rockafellar [57].

The dual of Peterson and Ecker is applicable to all convex $\ell_p$-constrained
$\ell_p$-programs, and {P.1} in particular, while Rockafellar's dual
is applicable to the class of faithfully convex programs, faithful
convexity being defined as follows.

Definition. A function f is faithfully convex if it is not affine
(simultaneously convex and concave) along any line segment, unless f
is affine along the entire line extending the line segment.

Because Rockafellar [57] has shown that his dual and that of
Peterson and Ecker are equivalent for {P.1} the designation {CD.1} will
be employed to refer to both.

The following form of {CD.1} will be used.

{CD.1}  maximize

\[
T(y,z) = -\tfrac{1}{2} \sum_{j \in N(y)} \frac{1}{y_j}\, z_j^t z_j + c^t y
\]

subject to:

\[
y_j \ge 0,\ j = 1, 2, \ldots, m, \qquad y_0 = 1,
\]
\[
\sum_{j=0}^{m} B_j^t z_j = \sum_{j=0}^{m} h_j y_j ,
\]
\[
z_j = 0 \ \text{if}\ y_j = 0, \qquad z_j \in E^n,\ y_j \in E^1 .
\]

{CD.1} is stated explicitly in terms of the dual variables $y_j$, $z_j$
for $j = 0, 1, 2, \ldots, m$. For notational convenience the number of dual
variables is exhibited as (m+1)(n+1) but in applications the number of
dual variables will be $\sum_{j=0}^{m} \rho_j + m$ where $\rho_j$ is the rank of $Q_j$.

Both Peterson and Ecker and Rockafellar have developed properties
of {CD.1} comparable to those developed for {D.1}. Three differences
between {CD.1} and {D.1} are seen to be the number of variables, $\sum_{j=0}^{m} \rho_j + m$
compared to m, the form of the constraint set and the form of the
objective function.

To compare the duals {D.1} and {CD.1} it is first noted that there
is an alternate characterization of the system

\[
Q(y)Q^-(y)Hy = Hy
\]

resulting in a consistent system of equations in an expanded space.
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Theorem  3.12.  Given  that  Qx  =  -Hy  is  consistent,  i.e.,  QQ  Hy  =  Hy, 

then  there  exists  a  vector  of  nrfl  n-vectors 

,  t  t  t      t.t, 
2  =  (Z0.Z1.Z2 V  ' 

such  that  for  Q.  =  B.B  ,  y^  =  1,  y^j  >  0.  J  =  0,l,2,...,in 

m  m 

I     bU     =-  I     hy  , 
j=0  J  J    j=o  J  J 

is  consistent.  The  converse  holds  if  z  =  0  when  y.  =  0. 

Proof  of  Theorem  3.12.  By  Lemma  2.3,  Q  can  be  factored  into 

Q  =  B^B 

=  B*^YYB 

where    B*"  =  (B^.B^.B^ B^)  for  Q^  =  B^B^ ,  j  =  0,l,2,...,m, 


B .  of  order  nxn  for  all  j , 


and 


Y  =  I  X 


0    ^ 

1 


0  .  . 


.  .  0 


0 


O'vS^ 


of  order  n(m+l)xn(m+l), 


a  Kronecker  product. 

Using  this  factorization  and  Theorem  2.3,  QQ  Hy  =  Hy  reduces  to 

B*^B*^'Hy  =  Hy. 
Hence,  there  exists  a  vector  z'  =  (z^  ,z'^  ,...,z^  )  such  that 

B*^2'  =  ±Hy 


B*Tz'  =  ±Hy. 

/  t  t      tvt 
By  defining  the  m+1  n-vector  z  =  (2^,z^ z^)    , 
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=j  =  z'79- 


It  Is  seen  that 


j=0  >•  ■•   j=0  •'   ^ 


The  converse  Is  obtained  by  reversing  the  above  sequence  of  arguments 

with  z.  =  0  when  y.  =0. 

Q.E.D. 

The  expression  for  the  feasible  set  F  In  the  expanded  dual  space 

Is  now 

F  =  {y  e  ^^^\y^   =  1  and   z .  e  e",  J  =  0,1,..., m 

m 


solving  ^     B.z     =     1     h  y ,  and  y  =  0  Implies 
j=0  ^   ■'   j=0  J  J      J 

Theorem  3.13.  There  exists  a  one  to  one  correspondence  between  the 
feasible  points  of  {D.l}  and  {CD.l}.  Furthermore,  for  such  poliits  the 
associated  objective  function  values  are  equal. 

Proof  of  Theorem  3.13.  The  proof  will  be  by  demonstrating  that 
{CD.l}  can  be  derived  from  {D.l}  and  vice  versa. 

From  {D.l}, 

ij'(y)  =  -  I  yVQ~(y)Q(y)Q"(y)Hy  +  cV 

which  by  the  factorization  of  Q(y)  =  B  (y)B(y)  Is  equivalent  to 

*(y)  =  -  I  [B'''(y)Hy]''[B^"(y)Hy]  +  c^'y 

where  B(y)  =  BY  as  defined  for  Theorem  3.12. 
Define 

z  =  YB^"(y)Hy  =  YBQ~Hy,  (3.8.1) 

z  Is  a  vector  of  m+1  n-vectors  and  the  j   n-vector  Is  zero  If  y  Is  zero. 
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Multiply  (3.8.1)  by  Y~, 

Y~z  =  Y"YBQ"Hy  =  (Y"Y)YBQ'Hy  =  YsQ^Hy 


Y~z  =  B(y)Q"(y)Hy  =  B^"(y)Hy.  (3.8.2) 

Substituting  (3.8.2)  into  i^iiy)   results  in 

T(y,z)  =  -  "I  (Y'z)'^(y"2)  +  cV.  (3.8.3) 

Multiply  (3.8.2)  by  B^(y), 

B*^(y)Y"z  =  B*^YY"z  «  B'^(y)B*^"(y)Hy  =  Hy 
by  the  constraints  of  {D.l}  and  Theorem  3.12. 

By  Theorem  A. 11,  YY  is  a  diagonal  matrix  with  y  >  0  implying  a 
diagonal  element  of  1  and  y.  =  0  implying  0,  resulting  in  YY~z  =  z  by 
definition.  Then  by  Theorem  3.12  the  constraints  of  {D.l}  can  be  stated 
as:' 

B^z  =  Hy  \ 

z.  =  0  if  y  =  0 

y.  >  0  for  j  =  1,2 m  and  y  =  1 

and  the  objective  function  by  (3.8.3),  hence  {D.l}  implies  {CD.l}. 

Now taking {CD.1}, it is seen by Theorem 3.12 that the constraint set of {CD.1} is equivalent to that of {D.1}. Furthermore,

$$B^t z = B^t Y Y^- z = B^t(y) Y^- z = H y$$
by definition of $Y$ and $z$ for {CD.1}. This equation is consistent; therefore,

$$Y^- z = B^{t-} H y + (I - B^{t-} B^t)\, g$$

for $g \in E^{(m+1)n}$. Substituting this into the objective function, equation (3.8.3), results in

$$T(y,g) = -\tfrac{1}{2}\,\bigl[(B^{t-}Hy)^t (B^{t-}Hy) + g^t (I - B^{t-}B^t)\, g\bigr] + c^t y.$$

The matrix $(I - B^{t-}B^t)$ is a symmetric positive semidefinite matrix and maximizing $-\tfrac{1}{2}\, g^t (I - B^{t-}B^t)\, g$ implies that $g$ is in the range space of $B$. In particular, $g = 0$ is in the range space of $B$ and so without loss of generality $T(y,g)$ can be simplified to

$$\psi(y) = -\tfrac{1}{2}\,(B^{t-}Hy)^t (B^{t-}Hy) + c^t y.$$

Hence, {CD.1} implies {D.1}.

Q.E.D. 
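
The equivalence just proved can be checked numerically on a small instance. The following is only a sketch under stated assumptions: the data (`B`, `H`, `c`, `y`) are randomly generated and hypothetical, NumPy's `pinv` stands in for the generalized inverse, and every $Q_j = B_j^t B_j$ is taken nonsingular so that the constraint of {D.1} holds automatically. The vector $z$ is built block by block from (3.8.1), using $z_j = y_j B_j Q^-(y) H y$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 4, 3                      # hypothetical problem sizes

# Hypothetical data: factors B_j (so Q_j = B_j^t B_j), columns h_j, constants c_j.
B = [rng.standard_normal((n, n)) for _ in range(m + 1)]
H = rng.standard_normal((n, m + 1))
c = rng.standard_normal(m + 1)
y = np.concatenate(([1.0], rng.uniform(0.5, 2.0, size=m)))   # y_0 = 1, y_j > 0

Hy = H @ y
Q = sum(yj * Bj.T @ Bj for yj, Bj in zip(y, B))   # Q(y) = sum_j y_j Q_j
Q_ginv = np.linalg.pinv(Q)                        # generalized inverse Q^-(y)

# Objective of {D.1}: -1/2 (Hy)^t Q^- Q Q^- (Hy) + c^t y
psi = -0.5 * Hy @ Q_ginv @ Q @ Q_ginv @ Hy + c @ y

# Block j of z from (3.8.1): z_j = y_j B_j Q^-(y) H y
z = [yj * Bj @ Q_ginv @ Hy for yj, Bj in zip(y, B)]

# Objective of {CD.1}: -1/2 sum_j z_j^t z_j / y_j + c^t y
T = -0.5 * sum(zj @ zj / yj for yj, zj in zip(y, z)) + c @ y

# Constraint of {CD.1}: sum_j B_j^t z_j = H y
residual = sum(Bj.T @ zj for Bj, zj in zip(B, z)) - Hy

print(abs(psi - T), np.linalg.norm(residual))
```

Both printed quantities should be at rounding level, which is consistent with the two dual objectives agreeing at corresponding feasible points.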

The  question  of  which  dual  form  would  be  more  advantageous  for  a 

particular  application  will  be  addressed  in  Chapter  4. 


CHAPTER  4 
DUAL  ALGORITHMIC  CONSIDERATIONS 

4.1  Introduction 

In Chapter 3 three dual forms were introduced for {P.1}: the Wolfe dual {W.1}, the generalized inverse dual {D.1}, and the conjugate dual {CD.1}.

Since {W.1} lacks many desirable properties which {D.1} and {CD.1} possess, the analysis of this chapter will address the question of whether {D.1} or {CD.1} offers computational advantages for either or both of the two subclasses of {P.1}.

The two subclasses of {P.1} will be defined as definite problems and semidefinite problems. Definite problems will be those for which $\Phi_0(x)$ is strictly convex or those which have a known active strictly convex constraint. Semidefinite problems will be those which are not definite.

The comparison analysis will be subjective in nature, but will catalog those critical elements of algorithmic development which are likely to prove computationally advantageous or disadvantageous.

To date there has been one reported algorithm developed for {CD.1}. This work, done by Ecker and Niemi [24], reported one experimental result based on a definite problem, but offered no comparison information with other algorithmic approaches.

For intermediate scale definite problems, primal problems of 40 variables and 30 constraints, a projected gradient algorithm was developed for {D.3} and reported on by Hearn and Randolph [35].

Experimental results, CPU time, obtained with this algorithm are catalogued in Table 1. The problems were randomly generated, had 100 percent dense matrices with positive components between zero and ten, and strictly convex objective functions. For each n,m three problems were generated and solved. Also, a general primal algorithm, the Sequential Unconstrained Minimization Technique (SUMT) of Fiacco and McCormick [28] as implemented by Mylander, Holmes and McCormick [43], was applied to three of the primal problems. The first problem, n = 5, m = 3, had a CPU time of 2.6 seconds; the second, n = 10, m = 5, had a CPU time of 11.9 seconds; and a third problem, n = 15, m = 10, was attempted, but terminated after 120 seconds.

4.2  Algorithm 

To compare the duals, {D.1} and {CD.1}, the projected gradient algorithm will be used as the standard for investigation. A discussion of the projected gradient algorithm is found in Rosen [59,60]. A discussion of feasible direction approaches, which includes gradient projection, is found in Zoutendijk [72].

The basic form of the algorithm is as follows, stated for {D.1}:

(a)  At a feasible point $y^k$ compute the gradient of the objective function, $\nabla\psi(y^k)$.

(b)  Determine the matrix T, defining the support of the constraint set at $y^k$. The rows of T are the gradients of the constraints active at $y^k$ (or a subset of the active constraints).

(c)  Compute the projection matrix P, which projects vectors onto the null space of T.

(d)  Compute $\lambda^k$, $y^{k+1} = y^k + \lambda^k P\nabla\psi(y^k)$, such that $\psi(y^{k+1})$ is maximized, for $P\nabla\psi(y^k) \ne 0$, and go to (b). If $P\nabla\psi(y^k) = 0$ and the orthogonal projection of $\nabla\psi(y^k)$ onto T, the null space of P, results in nonpositive components then $y^k$ is optimal; otherwise select the largest positive component of this projection and remove the associated equality constraint from the set of active constraints and go to (b).

Rosen, in defining the support matrix, required the selection of a linearly independent set of active constraint gradients for the rows of T. Other authors have used the definition of a regular point to meet this requirement; for example, see Luenberger [42]. This particular concept, while theoretically acceptable, creates computational difficulties. The difficulty is obviated by the use of the generalized inverse of T. That is, by Corollary A.8.1 the projection onto the null space of T is defined by the matrix $(I - T^-T)$ and the projection onto the null space of P by $T^-T$.
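
A minimal sketch of this projector follows, assuming NumPy's Moore-Penrose routine `pinv` as the generalized inverse. The support matrix `T` and the gradient `grad` are hypothetical; the third row of `T` is deliberately the sum of the first two, to illustrate that no linearly independent subset of active constraint gradients has to be selected.

```python
import numpy as np

def null_space_projector(T):
    """Return P = I - T^- T, the orthogonal projector onto the null space of T."""
    T = np.atleast_2d(np.asarray(T, dtype=float))
    return np.eye(T.shape[1]) - np.linalg.pinv(T) @ T

# Hypothetical support matrix with linearly dependent rows (row 3 = row 1 + row 2).
T = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 1.0, 0.0],
              [1.0, 1.0, 1.0, 0.0]])
P = null_space_projector(T)

grad = np.array([3.0, -1.0, 2.0, 5.0])   # hypothetical objective gradient
d = P @ grad                             # projected (feasible) direction

print(d)           # direction lying in the null space of T
print(T @ d)       # numerically zero: active constraints stay satisfied
```

The complementary projector is `np.linalg.pinv(T) @ T`, which corresponds to $T^-T$ in the text.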

Therefore,  in  comparing  the  two  dual  forms  the  generalized  inverse 
of  the  support  matrix  will  be  derived. 

Other  necessary  computations  for  the  algorithm  involve  finding 
gradients  of  the  objective  functions  and  solving  the  one-dimensional 
maximization  problem. 

An important consequence of both {D.1} and {CD.1} with respect to feasible direction algorithms is that initiating such an algorithm at a dual feasible point will be sufficient to insure that no infeasible dual solutions are generated. For {D.1}, Theorem 3.8 and Corollary 3.8.1 are seen to prove this statement.

Therefore, use of the projected gradient algorithm generates feasible directions and the one-dimensional maximization will result in dual feasible solutions.

4.3  Projection  Matrices 

The support matrices of the two dual problems will be determined and their null space projectors defined. To distinguish between the two, a sub-D will designate matrices for {D.1} and a sub-c those corresponding to {CD.1}.

To simplify the notation assume that the dual variables, $y_1, y_2, \ldots, y_s$, $s < m$, are zero and the active constraints of {D.1} are arranged as

$$y_j = 0, \quad j = 1,2,\ldots,s$$

$$(I - Q^-(y)Q(y))Hy = 0,$$
where $y \in F$. The matrix defining the support manifold at the intersection of these active constraints is given by

$$T_D = \begin{bmatrix} I \;\;\; 0 \\ (I - Q^-Q)H \end{bmatrix}$$

where $I \in E^{s \times s}$, $0 \in E^{s \times (m-s)}$, $(I - Q^-Q)H \in E^{n \times m}$, and it is understood that $H = (h_1, h_2, \ldots, h_m)$, the $h_0$ column being removed due to $y_0$ being constant for all $y \in F$. $T_D$ is seen to be an $(s+n) \times m$ matrix and its null space projection matrix is defined as
$$P_D = (I - T_D^- T_D).$$

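By Theorem A.15

$$T_D^t T_D^{t-} = UU^- + CC^-$$

where $U = \begin{bmatrix} I \\ 0 \end{bmatrix} \in E^{m \times s}$, $V = H^t(I - Q^-Q) \in E^{m \times n}$, and $C = (I - UU^-)V$.

By Corollary A.3.4 $U^- = [I, 0]$ and $UU^- = \begin{bmatrix} I & 0 \\ 0 & 0 \end{bmatrix} \in E^{m \times m}$, $I \in E^{s \times s}$.

Therefore,

$$C = (I - UU^-)V = \begin{bmatrix} 0 & 0 \\ 0 & I \end{bmatrix} H^t (I - Q^-Q), \quad I \in E^{(m-s) \times (m-s)},$$

and partitioning $H$, $H = (H_s, H_{\bar s})$ where $H_s = (h_1, h_2, \ldots, h_s)$ and $H_{\bar s} = (h_{s+1}, h_{s+2}, \ldots, h_m)$, it is seen that

$$C = \begin{bmatrix} 0 \\ H_{\bar s}^t (I - Q^-Q) \end{bmatrix} \in E^{m \times n}, \quad 0 \in E^{s \times n}.$$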


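Again applying Corollary A.3.4 results in

$$CC^- = \begin{bmatrix} 0 & 0 \\ 0 & H_{\bar s}^t (I - Q^-Q)\,[H_{\bar s}^t (I - Q^-Q)]^- \end{bmatrix}$$

and

$$H_{\bar s}^t (I - Q^-Q)\,[H_{\bar s}^t (I - Q^-Q)]^- = [(I - Q^-Q)H_{\bar s}]^- H_{\bar s} - [(I - Q^-Q)H_{\bar s}]^- Q^-Q H_{\bar s}$$

which reduces to

$$[(I - Q^-Q)H_{\bar s}]^- H_{\bar s}$$
since, by Theorem A.16, $[(I - Q^-Q)H_{\bar s}]^- Q^- = [Q(I - Q^-Q)H_{\bar s}]^- = 0$.

Combining the above,

$$P_D = \begin{bmatrix} 0 & 0 \\ 0 & I - [(I - Q^-Q)H_{\bar s}]^- H_{\bar s} \end{bmatrix}$$

and $\{I - [(I - Q^-Q)H_{\bar s}]^- H_{\bar s}\} \in E^{(m-s) \times (m-s)}$.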

Hence, applying the projected gradient algorithm to {D.1} it is only necessary to work with the dual variables which are positive, i.e., $y_j$ such that $j \in W(y)$.
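
The block form of the projector for {D.1} can be compared with a direct evaluation of $I - T_D^- T_D$. The sketch below is illustrative only: `Q` is a hypothetical singular positive semidefinite stand-in for $Q(y)$ held fixed, `H` is random, the first `s` dual variables are taken as zero, and `pinv` with an explicit singular value cutoff (an assumption made here for numerical safety) plays the role of the generalized inverse.

```python
import numpy as np

def ginv(A):
    # Moore-Penrose inverse; the cutoff treats rounding-level singular values as zero.
    return np.linalg.pinv(A, rcond=1e-10)

rng = np.random.default_rng(1)
n, m, s, r = 5, 5, 2, 4          # hypothetical sizes; Q has rank r < n

G = rng.standard_normal((r, n))
Q = G.T @ G                      # singular stand-in for Q(y)
H = rng.standard_normal((n, m))  # columns h_1, ..., h_m (h_0 removed)
N = np.eye(n) - ginv(Q) @ Q      # I - Q^- Q

# Direct evaluation: T_D = [[I, 0], [(I - Q^-Q) H]],  P_D = I - T_D^- T_D
T_D = np.vstack([np.hstack([np.eye(s), np.zeros((s, m - s))]), N @ H])
P_direct = np.eye(m) - ginv(T_D) @ T_D

# Block form: zero except for the block belonging to the positive variables
H_sbar = H[:, s:]
P_block = np.zeros((m, m))
P_block[s:, s:] = np.eye(m - s) - ginv(N @ H_sbar) @ H_sbar

print(np.allclose(P_direct, P_block, atol=1e-8))
```

Only the trailing $(m-s) \times (m-s)$ block is nonzero, so the rows and columns belonging to the zero dual variables never need to be formed.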

Now turning to {CD.1}, again assume that $y_1, y_2, \ldots, y_s = 0$ and the set of active constraints is

$$y_j = 0, \quad j = 1,2,\ldots,s$$

$$\sum_{j=0}^{m} B_j^t z_j - \sum_{j=0}^{m} h_j y_j = 0.$$

From this the matrix defining the support manifold of active constraints at the point $y$ is

$$T_c = \begin{bmatrix} I \;\; 0 & 0 \\ -H & B^t \end{bmatrix} \in E^{(s+n) \times (m+\bar m)}$$

where $I \in E^{s \times s}$, $\bar m = (m+1)n$, $B^t = (B_0^t, B_1^t, \ldots, B_m^t)$, and $H = (h_1, h_2, \ldots, h_m)$.

The null space projection matrix for {CD.1} is then

$$P_c = (I - T_c^- T_c).$$

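Applying Theorem A.15,

$$T_c^t T_c^{t-} = UU^- + CC^-$$

with $U = \begin{bmatrix} I \\ 0 \end{bmatrix}$ and $C = (I - UU^-) \begin{bmatrix} -H^t \\ B \end{bmatrix}$.

Corollary A.3.4 gives

$$UU^- = \begin{bmatrix} I & 0 \\ 0 & 0 \end{bmatrix}$$

resulting in

$$C = \begin{bmatrix} 0 \\ -H_{\bar s}^t \\ B \end{bmatrix}, \quad 0 \in E^{s \times n},$$

and

$$CC^- = \begin{bmatrix} 0 & 0 \\ 0 & \begin{bmatrix} -H_{\bar s}^t \\ B \end{bmatrix} \begin{bmatrix} -H_{\bar s}^t \\ B \end{bmatrix}^- \end{bmatrix}.$$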


Therefore,

$$P_c = \begin{bmatrix} 0 & 0 \\ 0 & I - (-H_{\bar s}, B^t)^- (-H_{\bar s}, B^t) \end{bmatrix} \in E^{(m+\bar m) \times (m+\bar m)}$$

where $[I - (-H_{\bar s}, B^t)^- (-H_{\bar s}, B^t)] \in E^{(m+\bar m-s) \times (m+\bar m-s)}$. Further use of Theorem A.15 will not simplify this form, but again it is seen that the projection matrix $P_c$ depends on the positive $y_j$ and not on the dual variable vectors $z_j$.


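4.4  Objective Function Gradient

The gradient of $\psi(y)$ is given in Theorem 3.8. For {CD.1}, the objective function is

$$T(y,z) = -\tfrac{1}{2} \sum_{j \in W(y)} \frac{1}{y_j}\, z_j^t z_j + c^t y$$

and its gradient is

$$\nabla T(y,z) = \left( \frac{\partial T}{\partial y_1},\ \frac{\partial T}{\partial y_2},\ \ldots,\ \frac{\partial T}{\partial y_m},\ \nabla_{z_0} T^t,\ \nabla_{z_1} T^t,\ \ldots,\ \nabla_{z_m} T^t \right)^t$$

where

$$\frac{\partial T}{\partial y_k} = \frac{1}{2}\, \frac{1}{y_k^2}\, z_k^t z_k + c_k \quad \text{for } k = 1,2,\ldots,m,$$

$$\nabla_{z_j} T = -\frac{1}{y_j}\, z_j \quad \text{for } j = 0,1,\ldots,m,$$

defined only for $y_j > 0$.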
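
As an illustration of these expressions, the sketch below evaluates the {CD.1} objective and its partial derivatives in plain Python and compares two of them with central finite differences. The function names and the numerical data are hypothetical; the formulas used are those obtained by differentiating $T(y,z)$ term by term, and all $y_j$ are taken positive.

```python
def cfd_objective(y, z, c):
    """T(y, z) = -1/2 * sum over y_j > 0 of (z_j . z_j) / y_j  +  c . y"""
    quad = sum(sum(v * v for v in zj) / yj for yj, zj in zip(y, z) if yj > 0)
    return -0.5 * quad + sum(cj * yj for cj, yj in zip(c, y))

def cfd_gradient(y, z, c):
    """Return (dT/dy, dT/dz); valid only for components with y_j > 0.
    Entry 0 of dT/dy is not used by the algorithm because y_0 is fixed at 1."""
    dy = [0.5 * sum(v * v for v in zj) / yj ** 2 + cj
          for yj, zj, cj in zip(y, z, c)]
    dz = [[-v / yj for v in zj] for yj, zj in zip(y, z)]
    return dy, dz

# Hypothetical data: m = 2, n = 2, y_0 = 1.
y = [1.0, 2.0, 0.5]
z = [[1.0, -2.0], [0.5, 1.5], [2.0, 0.0]]
c = [0.3, -1.0, 2.0]

dy, dz = cfd_gradient(y, z, c)

# Central finite differences in y_1 and in the first component of z_1.
eps = 1e-6
y_hi, y_lo = y[:], y[:]
y_hi[1] += eps
y_lo[1] -= eps
fd_y1 = (cfd_objective(y_hi, z, c) - cfd_objective(y_lo, z, c)) / (2 * eps)

z_hi = [row[:] for row in z]
z_lo = [row[:] for row in z]
z_hi[1][0] += eps
z_lo[1][0] -= eps
fd_z10 = (cfd_objective(y, z_hi, c) - cfd_objective(y, z_lo, c)) / (2 * eps)

print(dy[1], fd_y1)
print(dz[1][0], fd_z10)
```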
4.5  Definite Problems

By definition of definite problems $Q(y)$ is nonsingular and by Corollary 3.1.2 {D.1} specializes to {D.3}.

It is clear that for {D.3}, $F = F^*$, the selection of an initial $y \in \mathrm{ri}\, F$ is trivial, and the optimal dual solution completely defines the optimal primal solution.

For {CD.1} the major benefit of the definite problem is in the determination of an initial feasible solution point. That is, say $Q_0$ is positive definite, then

-1  m 


z  =  -  B^ 
o     \   o 


''1    I    [h.y.  +  bS,], 


arbitrarily selecting $y_j, z_j > 0$ for $j = 1,2,\ldots,m$, computing $z_0$, and recalling that $y_0 = 1$, results in an initial feasible dual solution.

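In terms of applying a projected gradient algorithm it is seen that for {D.1} the projection matrix reduces to

$$P_D = \begin{bmatrix} 0 & 0 \\ 0 & I \end{bmatrix} \in E^{m \times m}$$

where $I \in E^{(m-s) \times (m-s)}$, but in contrast, for {CD.1}, using Theorem A.6,

$$P_c = \begin{bmatrix} 0 & 0 \\ 0 & I - \begin{bmatrix} -H_{\bar s}^t \\ B \end{bmatrix} [H_{\bar s} H_{\bar s}^t + Q(1)]^{-1} (-H_{\bar s}, B^t) \end{bmatrix}$$

where $Q(1) = \sum_{j=0}^{m} Q_j$.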

One  means  of  comparing  algorithms  is  to  calculate  the  number  of 
multiplications  required.  This  basis  of  comparison,  while  not  foolproof, 
can be used as an indicator of relative computing requirements.

Table 2 details the multiplications for {D.1} and {CD.1} applied to definite problems. For convenience it has been assumed that $y_1, y_2, \ldots, y_s = 0$ for both {D.1} and {CD.1} and that the Choleski method of matrix inversion was employed (see Clasen [15] and Westlake [70]). Also, it is assumed that for a given primal problem {D.1} and {CD.1} will be solved in the same number of iterations.

An initial observation to be made is that the determination of $P_c$ and $P_c \nabla T(y,z)$ contributes the majority of the iteration operations to {CD.1}. This implies that if it were known that the optimal solution was in the relative interior of $F_c$, the feasible region of {CD.1}, then the iteration count for {CD.1} would be drastically reduced and other considerations relative to the two duals would come into play.

Another case where the multiplication counts are significantly different is quadratic programming. Here the iteration counts for both duals are reduced. That is, for {D.1} it is only necessary to compute one inverse at the initiation of the algorithm and for {CD.1} the order of $P_c$ is reduced from $(m+1)n+m-s$ to $n+m-s$. The cumulative result of these reductions, though, is that while the order of $P_c$ is reduced it is still necessary to invert an $n \times n$ matrix for each iteration not in the relative interior of $F$, and {D.1} still offers significantly fewer multiplicative operations.

For the general case, $m$ proper quadratic constraints, it is clear that for noninterior point solutions {D.1} incurs fewer multiplicative operations than does {CD.1}.

Other characteristics to be considered are the relative size of the two duals and associated computer storage requirements. {D.1} has $m$ dual variables whereas {CD.1} has $\sum_j p_j + m$ dual variables, $p_j$ being the rank of $Q_j$. For example, if n = 50 and m = 25, assuming the average $p_j = 25$, {D.1} would require 25 dual variables while {CD.1} would require 650 dual variables, 26 times as many. Computer storage requirements will differ in a comparable fashion also. That is, even though {D.1} needs to store the same number of matrices as {CD.1}, the $Q_j$ and $B_j$, for {CD.1} it will also be necessary to maintain storage for the matrix $P_c$, which will be of order 1300 x 1300, and the matrix of $z_j$ dual variables.

In conclusion, then, for definite problems the generalized inverse dual, {D.1} specialized to {D.3}, should prove to be favorable over the conjugate dual, {CD.1}, computationally.

4.6  Semidefinite Problems

The comparison of {D.1} to {CD.1} for semidefinite problems is not so clear cut. The major difficulty is that of determining an initial feasible dual vector, preferably in the relative interior of F. This problem is common to both dual forms and, therefore, will only be discussed as related to {D.1}.

The ideal initial dual vector for {D.1}, as stated, would be one in the relative interior of F. The following are classifications for which such an initial vector can be readily determined.

(a)  For some $k \in \{0,1,2,\ldots,m\}$, $\Phi_k$ is strictly convex. This implies that $Q_k$ is positive definite and F* is the nonnegative orthant. The initial point, $y^0$, can be selected such that $y_j > 0$ for $j = 1,2,\ldots,m$, $y \in \mathrm{ri}\, F$.

(b)  For $j,k = 0,1,2,\ldots,m$, $h_j$ is in the column space of $Q_k$. Again F* is the nonnegative orthant and $y$ can be selected such that $y_j > 0$, $j = 1,2,\ldots,m$, $y \in \mathrm{ri}\, F$.

The next set of classifications determine an initial dual vector $y$ in F, either a relative boundary or relative interior point.

(c)  $h_0$ is in the column space of $Q_0$; then $y_j = 0$ for $j = 1,2,\ldots,m$ will result in $y \in F$. Also, if there is an index set $I_r = \{0, j_1, j_2, \ldots, j_r\}$, such that for $k \in I_r$, $h_k$ is in the column space of some $Q_j$, $j \in I_r$, then $y_k > 0$ for $k \in I_r$, $y_k = 0$ for $k \notin I_r$ results in $y \in F$.

(d)  There exist $y \ge 0$ such that $Hy = 0$. This clearly results in $y \in F$. $Hy = 0$ and $y \ge 0$ can be expressed by the system

$$\sum_{j=1}^{m} h_j y_j = -h_0 \qquad (4.6.1)$$

$$y_j \ge 0 \qquad (4.6.2)$$

To facilitate recognition of this characterization it is noted that by Farkas' theorem [43] there will either exist a solution to the system (4.6.1), (4.6.2) or to

$$h_j^t x \ge 0 \ \text{ for } \ j = 1,2,\ldots,m \qquad (4.6.3)$$

$$-h_0^t x < 0 \qquad (4.6.4)$$

but not to both. The system (4.6.3), (4.6.4) is readily recognized through a linear programming formulation. By solving

{LP.1}   minimize     $\chi(x) = -h_0^t x$

subject to:  $h_j^t x \ge 0$

for $j = 1,2,\ldots,m$
a solution to the system (4.6.3), (4.6.4) is determined to exist or not. That is, if the optimal objective function value for {LP.1} is nonnegative then (4.6.1), (4.6.2) has a solution which is in F.
Obviously, not all problems will fall into the preceding categories and even if they did the recognition of a particular category is not especially straightforward. The following technique is intended to offer one approach to alleviating the difficulty by using an extension of the 'Big M' method of linear programming, see Charnes [12], which was also designed for determining initial feasible solutions.

Taking {P.1}, add the $(m+1)$st constraint
$$\Phi_{m+1}(x) = \tfrac{1}{2}\, x^t I x - M \le 0$$

with M a large positive number. {P.1} with this constraint will be referred to as the auxiliary problem, {AP.1}. Clearly, $\Phi_{m+1}(x)$ is strictly convex and thereby {AP.1} falls into classification (a) for selecting an initial feasible solution to the dual auxiliary problem, {AD.1}. The objective function of {AD.1} is

$$\psi'(y) = -\tfrac{1}{2}\, y^t H^t Q^-(y) H y + c^t y$$

where

$$y^t = (y_0, y_1, \ldots, y_m, y_{m+1}),$$

$$H = (h_0, h_1, \ldots, h_m, 0),$$

$$Q = Q_0 + \sum_{j=1}^{m} y_j Q_j + y_{m+1} I,$$

and $c = (c_0, c_1, \ldots, c_m, -M)$. Maximizing $\psi'(y)$ using an initial feasible solution containing $y_{m+1} > 0$ will lead to a feasible solution of {D.1}, driving $y_{m+1}$ to zero, if such a feasible solution exists. As in the two phase method of linear programming, see Dantzig [20], when $y_{m+1}$ is reduced to zero continued iterations should be performed using {D.1} instead of {AD.1}, since $y_{m+1}$ will never be made positive again due to the influence of -M on the objective function.
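
The effect of the added constraint on the dual matrix can be seen in a small sketch. The helper `aux_Q` and the rank-one data below are hypothetical; the point illustrated is only that any $y_{m+1} > 0$ shifts every eigenvalue of the semidefinite sum upward by $y_{m+1}$, so the auxiliary matrix is nonsingular and an ordinary inverse is available at the starting point.

```python
import numpy as np

def aux_Q(Q_list, y, y_extra):
    """Q_0 + sum_{j>=1} y_j Q_j + y_{m+1} I for the auxiliary dual {AD.1}.
    Q_list holds Q_0, ..., Q_m; y holds y_1, ..., y_m (y_0 = 1 is implicit)."""
    n = Q_list[0].shape[0]
    Q = Q_list[0] + sum(yj * Qj for yj, Qj in zip(y, Q_list[1:]))
    return Q + y_extra * np.eye(n)

# Hypothetical semidefinite data: Q_0 and Q_1 are both rank one.
u = np.array([1.0, 0.0, 0.0])
Q_list = [np.outer(u, u), np.outer(u, u)]
y = [1.0]

min_eig_without = np.linalg.eigvalsh(aux_Q(Q_list, y, 0.0)).min()
min_eig_with = np.linalg.eigvalsh(aux_Q(Q_list, y, 0.5)).min()

print(min_eig_without)   # zero: the original Q(y) is singular
print(min_eig_with)      # equals y_{m+1}: the auxiliary matrix is positive definite
```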
The above technique, while not particularly elegant, is simple to implement and will determine a feasible solution to {D.1} or indicate that {D.1} is infeasible.

Another potentially useful technique for solving semidefinite problems is again based on an auxiliary problem which is definite. The theoretical basis of the approach is found in Fiacco and McCormick [28, Theorem 37]. In essence, the proof is that if a sequence of {P.1} problems is solved with objective functions

$\epsilon_i > 0$ and $\lim_{i \to \infty} \epsilon_i = 0$
then the solution to the original {P.1} is the limit of $x^i$ as $i \to \infty$. This approach is computationally advantageous in that each problem in the sequence is definite and can be solved using {D.1} with no additional dual variables required. The disadvantage is the requirement to solve a sequence of problems.

For semidefinite problems both {D.1} and {CD.1} involve the computation of generalized inverses. In recent years, a sizable literature has developed relating to generalized inverse computational methods, but experimental results have yet to indicate preferable methods for classes of matrices. Rust, Burrus and Schneeberger [62] have developed a FORTRAN program based on Gram-Schmidt orthogonalization while Greville [33] and Tewarson [67] have developed methods for special partitioned matrices. Computational methods based on Gauss-Jordan elimination have been proposed by Ben-Israel and Wersan [6] and Noble [49]; whereas Pyle [56] has a method employing a gradient projection method. Also, Hestenes' [36] biorthogonalization technique has been extended, see Bouillon and Odell [8], to computing generalized inverses and Decell [23] and Ben-Israel and Charnes [5] have an approach which is based on the Cayley-Hamilton theorem.

The question of what technique might prove best for evaluating the objective of {D.1} is a numerical one which requires further research.


CHAPTER  5 
APPLICATIONS 

5.1  Introduction 

The intent of this chapter is to present some areas of application in which the generalized inverse dual, {D.1}, should prove useful.

It is demonstrated that the generalized inverse dual of some dual forms results in the solution of the primal problem which was nondifferentiable on the primal feasible space. Also, it is seen that such problems evidence other simplifications in the generalized inverse dual form.

To simplify the notation for presentation, some assumptions have been made which could be relaxed for discussing these applications in detail.

5.2  Multifacility Euclidean Distance Location Problem

This problem is one of locating in $E^n$ N new facilities such that the maximum weighted Euclidean distance from M existing facilities is minimized.

The problem is a minimax version of the well-known Fermat problem which is addressed by Kuhn [39]. (For additional references, see Cabot, Francis, and Stary [10].) Other distance measures for the problem have also been considered; most notably the rectilinear measure considered by Dearing and Francis [22].

The primal Euclidean distance minimax multifacility location problem is:

{M.1}    $\underset{x_1,\ldots,x_N}{\text{minimize}}$  maximum $\{\, w_{ji}\,\|x_j - a_i\|^2$ for all j and i;  $u_{jk}\,\|x_j - x_k\|^2$ for $1 \le j < k \le N \,\}$

where:

$a_i \in E^n$   represents existing facility site for $i = 1,2,\ldots,M$.

$x_j \in E^n$   new facility site for $j = 1,2,\ldots,N$.

$w_{ji} \in E$   given nonnegative weight (squared) representing interaction between new facility $x_j$ and existing facility $a_i$ for all j and i.

$u_{jk} \in E$   given nonnegative weight (squared) representing interaction between new facilities $x_j$ and $x_k$ for $1 \le j < k \le N$.

$\|\cdot\|$   Euclidean norm function.
{M.1} can be written as a constrained nonlinear programming problem by letting the variable s represent the minimand;

{M.2}    minimize     s

subject to:  $w_{ji}\,\|x_j - a_i\|^2 - s \le 0$

for $j = 1,2,\ldots,N$ and $i = 1,2,\ldots,M$,

$u_{jk}\,\|x_j - x_k\|^2 - s \le 0$
for $1 \le j < k \le N$
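
The relation between the minimax objective and the variable s can be illustrated with a short sketch. The function names and the data are hypothetical, and squared Euclidean distances are assumed throughout, in line with the squared weights described above. For fixed sites, the smallest s satisfying every constraint of {M.2} is exactly the value of the minimax objective.

```python
def minimax_objective(x, a, w, u):
    """Largest weighted squared Euclidean distance for new sites x and existing
    sites a. w[j][i] weights (x_j, a_i); u[j][k] weights (x_j, x_k) for j < k."""
    def sqdist(p, q):
        return sum((pi - qi) ** 2 for pi, qi in zip(p, q))
    terms = [w[j][i] * sqdist(x[j], a[i])
             for j in range(len(x)) for i in range(len(a))]
    terms += [u[j][k] * sqdist(x[j], x[k])
              for j in range(len(x)) for k in range(j + 1, len(x))]
    return max(terms)

def feasible_for_M2(x, s, a, w, u, tol=1e-12):
    """True when (x, s) satisfies every constraint of the epigraph form."""
    return minimax_objective(x, a, w, u) - s <= tol

# Hypothetical instance: two existing and two new facilities in the plane.
a = [(0.0, 0.0), (4.0, 0.0)]
x = [(1.0, 0.0), (3.0, 0.0)]
w = [[1.0, 1.0], [1.0, 1.0]]
u = [[0.0, 2.0], [0.0, 0.0]]     # only u[0][1] is used

s_min = minimax_objective(x, a, w, u)
print(s_min)                               # 9.0
print(feasible_for_M2(x, 9.0, a, w, u))    # True
print(feasible_for_M2(x, 8.9, a, w, u))    # False
```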

To facilitate the presentation it will be assumed that $w_{ji}$, $j = 1,2,\ldots,N$ and $i = 1,2,\ldots,M$, are all positive and that $u_{jk}$, for all $1 \le j < k \le N$, are also positive. This assumption can be replaced by a chaining assumption, see Cabot and Francis [9] for discussion, which insures the problem being well formulated.
Using the Euclidean norm squared leads to the following interpretation of {M.2},

{M.3}    minimize     $p_0^t x$

subject to:  $\tfrac{1}{2}\, x^t D_j x + h_{ji}^t x + c_i \le 0$   (5.2.1)

for $j = 1,2,\ldots,N$ and $i = 1,2,\ldots,M$,

$\tfrac{1}{2}\, x^t D_{jk} x + p_{jk}^t x \le 0$   (5.2.2)

for $1 \le j < k \le N$

where now $x = (x_1^t, x_2^t, \ldots, x_N^t, s)^t \in E^{nN+1}$,

$x_j^t = (x_{1j}, x_{2j}, \ldots, x_{nj}) \in E^n$,

$h_{ji}^t = (0_1^t, 0_2^t, \ldots, 0_{j-1}^t, -2a_i^t, 0_{j+1}^t, \ldots, 0_N^t, -\tfrac{1}{w_{ji}})$ with $0_j$ the zero vector in $E^n$,

$p_{jk}^t = (0_1^t, 0_2^t, \ldots, 0_N^t, -\tfrac{1}{u_{jk}}) \in E^{nN+1}$,

$p_0^t = (0_1^t, 0_2^t, \ldots, 0_N^t, 1)$,

$c_i = a_i^t a_i \in E$, $D_j$ and $D_{jk}$ are matrices.

Let D be an $(nN+1) \times (nN+1)$ matrix represented by

$$D = \begin{bmatrix} d_{11}I & d_{12}I & \cdots & d_{1N}I & 0 \\ d_{21}I & d_{22}I & \cdots & d_{2N}I & 0 \\ \vdots & \vdots & & \vdots & \vdots \\ d_{N1}I & d_{N2}I & \cdots & d_{NN}I & 0 \\ 0^t & 0^t & \cdots & 0^t & 0 \end{bmatrix}$$

where $d_{ij} \in E$, $I \in E^{n \times n}$ and $0 \in E^n$. The matrices $D_j$ and $D_{jk}$ are defined as follows:

$D_j = D$ such that $d_{jj} = 2$ and all other $d_{il} = 0$,

$D_{jk} = D$ such that $d_{jj} = d_{kk} = 2$, $d_{jk} = d_{kj} = -2$ and all other $d_{il} = 0$.

Clearly, {M.3} is a quadratically constrained quadratic programming problem and, by definition of the Euclidean norm, $D_j$ and $D_{jk}$ for all j and k are positive semidefinite.

Let the dual variables associated with the primal constraints (5.2.1) be designated by $y_{ji}$ and those associated with (5.2.2) be designated by $v_{jk}$.

Hence, by Theorem 3.1, the dual of {M.3} is

{MD.1}   maximize     $\psi(\bar y) = -\tfrac{1}{2}\, \bar y^t H^t Q^- H \bar y + \sum_{j=1}^{N} \sum_{i=1}^{M} y_{ji}\, c_i$

subject to:   $\bar y \ge 0$, $\bar y \in E^{NM+[N(N-1)/2]+1}$

$\bar y = (y_0, y^t, v^t)^t$

$y_0 = 1$

$(I - QQ^-)H\bar y = 0$

where

$$Q = \sum_{j=1}^{N} \sum_{i=1}^{M} y_{ji} D_j + \sum_{j=1}^{N-1} \sum_{k=j+1}^{N} v_{jk} D_{jk},$$

$$H = (p_0, h_{11}, h_{12}, \ldots, h_{NM}, p_{12}, \ldots, p_{1N}, p_{23}, \ldots, p_{N-1,N}).$$

It is noted that

$$H\bar y = \begin{bmatrix} -2\sum_{i=1}^{M} y_{1i}\, a_i \\ -2\sum_{i=1}^{M} y_{2i}\, a_i \\ \vdots \\ -2\sum_{i=1}^{M} y_{Ni}\, a_i \\ 1 - \sum_{j=1}^{N} \sum_{i=1}^{M} \dfrac{y_{ji}}{w_{ji}} - \sum_{j=1}^{N-1} \sum_{k=j+1}^{N} \dfrac{v_{jk}}{u_{jk}} \end{bmatrix}.$$
Also, by defining

$$a_{jj} = \sum_{i=1}^{M} y_{ji} + \sum_{k=j+1}^{N} v_{jk} + \sum_{k=1}^{j-1} v_{kj} \qquad (5.2.3)$$

for $j = 1,2,\ldots,N$, using the convention that an empty sum is zero; and

$$a_{ij} = -v_{ij} \ \text{ for } \ 1 \le i < j \le N, \qquad (5.2.4)$$

$$a_{ji} = a_{ij} \qquad (5.2.5)$$

it is seen that

$$Q = \begin{bmatrix} \bar Q & 0 \\ 0 & 0 \end{bmatrix}$$

where $\bar Q = I \times 2A$, $A = (a_{ij}) \in E^{N \times N}$, $I \in E^{n \times n}$, is defined as a Kronecker product. By Corollary A.3.4,

$$Q^- = \begin{bmatrix} \bar Q^- & 0 \\ 0 & 0 \end{bmatrix}$$

and by Theorem A.12

$$\bar Q^- = I \times \tfrac{1}{2} A^-.$$

Therefore,

$$F = \left\{ \bar y \ge 0 \ \middle|\ y_0 = 1,\ \begin{bmatrix} I - (I \times AA^-) & 0 \\ 0^t & 1 \end{bmatrix} H\bar y = 0 \right\}.$$

Looking at the matrix A it is noted that by setting all dual variables positive and applying the following known theorem of linear algebra, stated here as a lemma with proof omitted, it is seen that A has rank N.

Lemma 5.1. If A is a symmetric $N \times N$ matrix and if $a_{ii} > \sum_{j \ne i} |a_{ij}|$ for $i = 1,2,\ldots,N$, then A is a positive definite matrix.
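
A small sketch may make the use of the lemma concrete. It assumes the matrix A is assembled from the dual variables as in (5.2.3) through (5.2.5), so that each diagonal entry exceeds the absolute row sum of the off-diagonal entries by exactly $\sum_i y_{ji}$. The function names and the numerical values are hypothetical, and positive definiteness is tested with a plain Cholesky attempt.

```python
def build_A(y, v):
    """A from (5.2.3)-(5.2.5): a_ij = -v_ij off the diagonal, and
    a_jj = sum_i y[j][i] + sum of v over all pairs containing j.
    v[j][k] is read only for j < k."""
    N = len(y)
    A = [[0.0] * N for _ in range(N)]
    for j in range(N):
        for k in range(j + 1, N):
            A[j][k] = A[k][j] = -v[j][k]
    for j in range(N):
        A[j][j] = sum(y[j]) - sum(A[j][k] for k in range(N) if k != j)
    return A

def strictly_diagonally_dominant(A):
    return all(A[i][i] > sum(abs(A[i][j]) for j in range(len(A)) if j != i)
               for i in range(len(A)))

def is_positive_definite(A):
    """Attempt a Cholesky factorization of the symmetric matrix A."""
    N = len(A)
    L = [[0.0] * N for _ in range(N)]
    for i in range(N):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return False
                L[i][i] = s ** 0.5
            else:
                L[i][j] = s / L[j][j]
    return True

# Hypothetical dual values with N = 3 new and M = 2 existing facilities.
y_pos = [[0.5, 0.2], [0.1, 0.3], [0.4, 0.4]]
y_zero = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
v = [[0.0, 1.0, 2.0], [0.0, 0.0, 0.5], [0.0, 0.0, 0.0]]

A_pos = build_A(y_pos, v)
print(strictly_diagonally_dominant(A_pos), is_positive_definite(A_pos))
print(strictly_diagonally_dominant(build_A(y_zero, v)))
```

With all $y_{ji}$ positive the hypothesis of the lemma holds; with every $y_{ji}$ equal to zero the dominance is no longer strict, so the lemma cannot be invoked.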
Therefore, A is nonsingular and

$$F^* = \left\{ \bar y \ge 0 \ \middle|\ y_0 = 1,\ \sum_{j=1}^{N} \sum_{i=1}^{M} \frac{y_{ji}}{w_{ji}} + \sum_{j=1}^{N-1} \sum_{k=j+1}^{N} \frac{v_{jk}}{u_{jk}} = 1 \right\}$$
with an index maximal dual feasible vector being one which has all components positive. The selection of an initial solution in ri F* = ri F is trivial.

Furthermore, for any feasible vector y it is seen by (5.2.3), (5.2.4) and (5.2.5) that if $a_{jj}$ is zero for a particular j then the elements of the associated row and column are zero. Therefore, by an elementary, orthogonal, row and column transformation, E, A can be expressed as

$$\begin{bmatrix} A_{11} & 0 \\ 0 & 0 \end{bmatrix}$$

where $A_{11}$ has positive diagonal elements.
It can then be shown that

$$A = E \begin{bmatrix} A_{11} & 0 \\ 0 & 0 \end{bmatrix} E$$

and by Corollary A.3.4

$$A^- = E \begin{bmatrix} A_{11}^- & 0 \\ 0 & 0 \end{bmatrix} E.$$

Often it will hold that $A_{11}$ is also positive definite, in which case $A_{11}^- = A_{11}^{-1}$ and the dual is further simplified.

Thus, it is seen that if $\psi(y)$, a concave objective function, is maximized by application of a feasible direction algorithm (feasible directions to F) over F*, a linearly constrained region, then such a solution also solves {MD.1}. The computational difficulties of solving this dual are of size (dimension of Q and H) and not complexity, as the foregoing demonstrates.

The analysis of this section can also be extended to include multifacility problems with constraints. This would encompass some type of constraint on the weighted distance between any two facilities, either existing or new. For a discussion of this problem see Dearing [21].

5.3  Sinha  Duality 

Another application of {D.1} is in solving Sinha's [55] problem. Sinha has established a dual for one formulation of the stochastic programming problem. The primal and dual forms of Sinha are

{SP.1}   maximize     $\Phi_0(x) = d^t x - \sum_{j=1}^{m} (x^t Q_j x)^{1/2}$

subject to:  $Ax \le b$

$x \ge 0$

where $d, x \in E^n$, $Q_j \in E^{n \times n}$ and symmetric positive semidefinite, $A \in E^{s \times n}$, $b \in E^s$.

{SD.1}   minimize     $\Omega(z) = b^t z$

subject to:  $A^t z + \sum_{j=1}^{m} Q_j w^j \ge d$

$w^{jt} Q_j w^j \le 1$, for $j = 1,2,\ldots,m$
$z \ge 0$
where $w^j \in E^n$ for $j = 1,2,\ldots,m$ and $z \in E^s$.

Making the following notational definitions,

$\bar w^t = (z^t, w^{1t}, w^{2t}, \ldots, w^{mt}) \in E^{s+mn}$,

$\bar b^t = (b^t, 0, 0, \ldots, 0) \in E^{s+mn}$,

$\tilde Q^t = (A^t, Q_1^t, Q_2^t, \ldots, Q_m^t) \in E^{n \times (s+mn)}$,

$\bar Q_j \in E^{(s+mn) \times (s+mn)}$ an m+1 block diagonal matrix such that there is first an $s \times s$ zero matrix and then m $n \times n$ matrices, the (j+1)st matrix on the diagonal being $Q_j$, all others zero,

$\bar Q_0 = (I, 0, 0, \ldots, 0) \in E^{s \times (s+mn)}$,
Sinha's dual is now given by

{SD.1'}  minimize     $\Omega(\bar w) = \bar b^t \bar w$

subject to:  $\tfrac{1}{2}\, \bar w^t \bar Q_j \bar w - \tfrac{1}{2} \le 0$

for $j = 1,2,\ldots,m$

$-\tilde Q^t \bar w + d \le 0$

$-\bar Q_0 \bar w \le 0$

Defining $y^t = (y_0, y_m^t, y_n^t, y_s^t) \in E^{1+m+n+s}$ where $y_0 = 1$, $y_m \in E^m$, $y_n \in E^n$, and $y_s \in E^s$ results in the following generalized inverse dual.

{D.5}    maximize     $\psi(y) = -\tfrac{1}{2} \Bigl\{ \sum_{j=1}^{m} \bigl[\, y_n^t Q_j (y_{mj} Q_j)^- Q_j y_n + y_{mj} \bigr] \Bigr\} + d^t y_n$

subject to:  $A y_n + I_{s \times s}\, y_s = b$


for $j = 1,2,\ldots,m$.

$y_m \ge 0$

$y_n \ge 0$

$y_s \ge 0$

where it is seen that the vector $y_s$ can be dropped and the first linear equality converted to a linear inequality, $A y_n \le b$.

It is first observed that while Sinha's dual has s+mn variables, {D.5} has only m+n. Furthermore, $\psi(y)$ can be rewritten, based on the properties of the generalized inverse and Corollary A.3.5, as

$$\psi(y) = d^t y_n - \tfrac{1}{2} \sum_{j \in W(y_m)} \left[ \frac{1}{y_{mj}}\, y_n^t Q_j y_n + y_{mj} \right]$$
where $W(y_m) = \{ j \mid y_{mj} > 0 \}$, with $Q_j y_n = 0$ for $j \notin W(y_m)$.
The gradient of $\psi(y)$ for any feasible point is likewise available by Theorem 3.8 and recalling that if $(y_{mj} Q_j)$ is the zero matrix, then its generalized inverse is also a zero matrix. An initial feasible point for {D.5} can be derived by solving $A y_n \le b$, $y_n \ge 0$, and setting $y_{mj} = 1$ for all $j = 1,2,\ldots,m$. Hence, solution procedures for {D.5} will require no matrix inversion computations and are therefore computationally tractable.

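The merit of {D.5} is exemplified in the next theorem.

Theorem 5.1. Given that $(y_n^{\circ t}, y_m^{\circ t})^{t}$ solves {D.5}, then $x^{\circ} = y_n^{\circ}$
solves {SP.1}. Also, if $x^{\circ}$ solves {SP.1}, then $(x^{\circ t}, y_m^{\circ t})^{t}$ solves {D.5},
where $y_{mj}^{\circ} = (x^{\circ t} Q_j x^{\circ})^{1/2}$.

Proof of Theorem 5.1. Clearly $x^{\circ} = y_n^{\circ}$ is feasible to {SP.1}.
Also, for $j \in N(y_m^{\circ})$,

$$\frac{\partial \psi}{\partial y_{mj}}(y^{\circ}) = -\tfrac{1}{2} \left[ 1 - \frac{y_n^{\circ t} Q_j y_n^{\circ}}{(y_{mj}^{\circ})^{2}} \right] = 0$$

and $y_{mj}^{\circ} = (y_n^{\circ t} Q_j y_n^{\circ})^{1/2}$.
Substituting into $\psi(y)$,

$$\psi(y^{\circ}) = -\sum_{j \in N(y_m^{\circ})} (y_n^{\circ t} Q_j y_n^{\circ})^{1/2} + d^{t} y_n^{\circ} = \phi_0(y_n^{\circ})$$

where it is recalled that if $j \notin N(y_m^{\circ})$, $Q_j y_n^{\circ} = 0$. Then $\phi_0(x = y_n^{\circ}) \ge \phi_0(x)$
for all feasible $x$ to {SP.1}; assuming otherwise leads to a contradiction
on $y^{\circ}$ being optimal to {D.5}.

Now, assuming that $x^{\circ}$ is a solution to {SP.1}, it is seen that by
letting $y_{mj}^{\circ} = (x^{\circ t} Q_j x^{\circ})^{1/2}$, $(x^{\circ t}, y_m^{\circ t})^{t}$ is feasible to {D.5} and
$\psi(x^{\circ}, y_m^{\circ}) = \phi_0(x^{\circ})$. Also, $\psi(x^{\circ}, y_m^{\circ}) \ge \psi(y_n, y_m)$ for $(y_n^{t}, y_m^{t})^{t}$ feasible to
{D.5}, as otherwise a contradiction results to $x^{\circ}$ being optimal to {SP.1}.

Q.E.D.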
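
The closed form for $y_{mj}^{\circ}$ used in the theorem can be checked with a one-variable calculation. The following is only an illustrative sketch: the shorthand $t_j$ and $g_j$ is introduced here and does not appear in the original development, and $t_j > 0$ is assumed.

```latex
% Shorthand (assumed for illustration): t_j = y_n^t Q_j y_n > 0.
% For y_{mj} > 0, Corollary A.3.5 gives (y_{mj} Q_j)^- = (1/y_{mj}) Q_j^-,
% and Q_j Q_j^- Q_j = Q_j, so the j-th bracket of \psi contributes
g_j(y_{mj}) = -\tfrac{1}{2}\left( \frac{t_j}{y_{mj}} + y_{mj} \right),
\qquad
g_j'(y_{mj}) = -\tfrac{1}{2}\left( 1 - \frac{t_j}{y_{mj}^{2}} \right) = 0
\;\Longrightarrow\; y_{mj} = t_j^{1/2},
\qquad
g_j\!\left( t_j^{1/2} \right) = -\,t_j^{1/2}.
```

Since $g_j$ is concave for $y_{mj} > 0$, the stationary point is a maximum, and summing over $j$ recovers the square-root terms $-(y_n^{t} Q_j y_n)^{1/2}$.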
By using {D.5} instead of {SP.1} the difficulty of the nondifferentiability
of $\phi_0(x)$ is avoided by expanding the problem dimensionality to
$n+m$. This, as opposed to {SD.1} having dimensionality of $s+mn$, should offer
significant advantages in developing efficient computational procedures.
Also, solutions to {D.5} define solutions to {SP.1} by the above theorem,
and it is unnecessary to invoke the added complexity of computing primal
variables from dual solutions to {SD.1}, which as Sinha discusses will
often involve solution of a linear program.

An obvious application of {D.5} is to stochastic programming problems,
but another application is found in the work of Cabot and Francis [9],
who used a formulation of Sinha's dual in investigating a multifacility
location problem involving Euclidean distances. Again the difficulty of
the primal problem involves the nondifferentiability of the objective
function at feasible points.

Cabot and Francis [9] employ an equality constrained form of
Sinha's primal and develop an equality constrained dual. In handling
this through {D.5} there are two alternatives. First, as per section
3.6, the linear equality constraint could be used to convert the Sinha
dual to an equivalent form with no equality constraints and then
formulate {D.5}. Alternatively, the equality constraint could be
retained, resulting in y being unrestricted in {D.5}. Because of
the necessity to compute the generalized inverse of an $n \times (mn+s)$ matrix
with the first alternative and, by Theorem 5.1, the fact that the
second alternative results in the primal solution, the second alternative
should prove preferable even at the cost of more variables.
5.4 General Fermat Problem

The Fermat problem dates from the 17th century and was originally
stated as: Given three points in the plane, find a fourth point such
that the sum of its distances to the three given points is a minimum.
This form of the problem has been addressed by several authors; see
Kuhn [39] for a historical sketch.

The general Fermat problem is: given m points in the plane, find
the point which minimizes the sum of positively weighted distances to the
m points.

{GF.1}  minimize over $x$:  $\phi_0(x) = \sum_{j=1}^{m} w_j \|x - p_j\|$

for $w_j > 0$, $p_j$ the given $j$th point and $\|x-y\|$ the Euclidean distance
between points x and y.

From the preceding section this is recognized as a special case of
the Sinha primal. If one dualizes the corresponding Sinha dual, as in
section 5.3, a new version of {GF.1} results, namely

{D.6}  minimize  $\psi'(x,y) = \tfrac{1}{2} \sum_{j=1}^{m} \left[ (x-p_j)^{t} (y_j I)^{-} (x-p_j) + y_j w_j^{2} \right]$

subject to:  $y_j \ge 0$

$[I - (y_j I)^{-} (y_j I)](x - p_j) = 0$  for $j = 1,2,\ldots,m$

It is clear that $F^{*} = \{ (x,y) \in E^{2+m} \mid y > 0 \}$ and an algorithm can
readily be initiated at a relative interior point of F. Use of a
projected gradient algorithm for {D.6} is easily implemented by noting
that the projection matrix is diagonal with zero diagonal elements
associated with $y_j = 0$ and $x = p_j$ and unity elsewhere.
Note that at the optimal point, for $y_j^{\circ} > 0$,

$$\frac{\partial \psi'}{\partial y_j}(x^{\circ}, y^{\circ}) = \tfrac{1}{2} \left[ w_j^{2} - \frac{\|x^{\circ} - p_j\|^{2}}{(y_j^{\circ})^{2}} \right] = 0$$

which implies that $y_j^{\circ} = w_j^{-1} \|x^{\circ} - p_j\|$ and, as in Theorem 5.1, at optimality

$$\psi'(x^{\circ}, y^{\circ}) = \phi_0(x^{\circ}).$$

By the theory of Chapter 3 the function $\psi'(x,y)$ is convex and twice
continuously differentiable over $y > 0$, $x$ unrestricted, the relative
interior of F. Furthermore, if some $y_j \to 0$, $\psi'(x,y) \to +\infty$ unless $x \to p_j$.
This is seen by noting that the numerator $\|x - p_j\|^{2}$ goes to zero faster
than $y_j$. It is noted that this notion applies to the more general
multifacility location problem of Cabot and Francis [9].
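
To make the relationship between {GF.1} and {D.6} concrete, the following minimal sketch minimizes $\psi'$ by alternating between $y$ and $x$. This is a Weiszfeld-type iteration chosen for brevity, not the projected gradient algorithm discussed in the text. The function names, the test points, the starting point, and the floor `eps` on $y_j$ are all assumptions made for illustration.

```python
import math

def phi0(x, points, weights):
    """Weighted sum of Euclidean distances, the objective of {GF.1}."""
    return sum(w * math.dist(x, p) for p, w in zip(points, weights))

def psi_prime(x, y, points, weights):
    """Objective of {D.6} for y > 0, where (y_j I)^- = (1/y_j) I."""
    return 0.5 * sum(math.dist(x, p) ** 2 / yj + yj * w ** 2
                     for p, w, yj in zip(points, weights, y))

def alternate(points, weights, x, iters=200, eps=1e-12):
    """Alternating minimization of psi_prime (a Weiszfeld-type iteration)."""
    for _ in range(iters):
        # y-step: stationarity in y_j gives y_j = ||x - p_j|| / w_j (floored at eps).
        y = [max(math.dist(x, p) / w, eps) for p, w in zip(points, weights)]
        # x-step: for fixed y, psi_prime is a weighted least-squares problem in x.
        total = sum(1.0 / yj for yj in y)
        x = tuple(sum(p[i] / yj for p, yj in zip(points, y)) / total
                  for i in range(len(x)))
    y = [max(math.dist(x, p) / w, eps) for p, w in zip(points, weights)]
    return x, y

# Hypothetical data: the corners of a square with unit weights.
points = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
weights = [1.0, 1.0, 1.0, 1.0]
x_opt, y_opt = alternate(points, weights, x=(0.3, 0.7))
print(x_opt, phi0(x_opt, points, weights), psi_prime(x_opt, y_opt, points, weights))
```

For this symmetric data the iterates approach the center of the square, and the two objective values agree there, as the relation $y_j^{\circ} = w_j^{-1}\|x^{\circ} - p_j\|$ predicts.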


5.5  Portfolio  Selection 

There exist several forms of this problem; examples are found in
Saaty and Bram [63], Markowitz [45], Mao and Särndal [44], Roy [61] and
Sharpe [64].

The problem addressed here is

{PS.1}  maximize  $\sum_{j=1}^{n} \mu_j x_j$   (minimize $-\sum_{j=1}^{n} \mu_j x_j$)

subject to:  $\sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j \le \sigma$

$\sum_{i=1}^{n} x_i = c$

$x_j \ge 0$

where a given sum of money $c$ is to be allocated to $n$ securities, $\mu_j$ is
the expected rate of return on the $j$th security, the nonzero symmetric
matrix $(a_{ij})$ is the covariance matrix of the random variables $r_j$, the
return rate on the $j$th security, and $\sigma$ is a constraint on the variance
allowable.

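The equality constraint is equivalent to

$$x = (h^{t})^{-} c + (I - (h^{t})^{-} h^{t})\, g, \qquad g \in E^{n}$$

where $h = (1,1,\ldots,1)^{t} \in E^{n}$, each element being unity. It is easily
shown that $(h^{t})^{-} = \tfrac{1}{n} h$, $(h^{t})(h^{t})^{-} = 1$, and

$$J = [I - (h^{t})^{-} (h^{t})] = \frac{1}{n} \begin{bmatrix} n-1 & -1 & \cdots & -1 \\ -1 & n-1 & \cdots & -1 \\ \vdots & \vdots & \ddots & \vdots \\ -1 & -1 & \cdots & n-1 \end{bmatrix}$$

has rank $n-1$.

Hence,

$$x = \frac{c}{n} h + Jg$$

and {PS.1} is transformed to determining $g$.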
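
The substitution can be checked numerically. The sketch below is illustrative only: the function names and the sample values of $c$ and $g$ are assumptions, and it does not enforce the separate requirement $x_j \ge 0$.

```python
def projector_J(n):
    """J = I - (h^t)^- h^t with (h^t)^- = h/n, i.e. J = I - (1/n) h h^t."""
    return [[(1.0 if i == j else 0.0) - 1.0 / n for j in range(n)]
            for i in range(n)]

def matvec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def allocation(c, g):
    """x = (c/n) h + J g; the components sum to c for every choice of g."""
    n = len(g)
    Jg = matvec(projector_J(n), g)
    return [c / n + v for v in Jg]

# Hypothetical budget and an arbitrary vector g.
x = allocation(c=100.0, g=[3.0, -1.0, 7.0, 0.5])
print(x, sum(x))
```

Because $Jh = 0$, any $g$ yields an allocation that spends exactly $c$, which is why the equality constraint disappears from the transformed problem.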


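{PS.2}  minimize  $-\tfrac{c}{n}\, u^{t} h - u^{t} J g$

subject to:  $\tfrac{1}{2}\, g^{t} JQJ g + \tfrac{c}{n}\, h^{t} Q J g - \left( \sigma - \tfrac{1}{2} \left( \tfrac{c}{n} \right)^{2} h^{t} Q h \right) \le 0$

$-Jg - \tfrac{c}{n}\, h \le 0$

where $Q = 2A = 2(a_{ij}) \in E^{n \times n}$,
$u = (\mu_1, \mu_2, \ldots, \mu_n)^{t} \in E^{n}$ and $g \in E^{n}$.

{PS.2} has a linear objective function, one quadratic constraint and
$n$ linear inequality constraints.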

The generalized inverse dual of {PS.2} is

{D.7}  maximize  $\psi(y) = -\tfrac{1}{2}\, y^{t} H^{t} (y_1 Q')^{-} H y + Z^{t} y$

subject to:  $y = (y_0, y_1, \ldots, y_{n+1})^{t} \ge 0$

$Cy = 0$

where $H = (u, \tfrac{c}{n} Qh, I)$, $Q' = JQJ$,
$C = [I - (y_1 JQJ)^{-} (y_1 JQJ)]\, H$,
and $\psi(y)$ has been simplified by the readily proved identity
$(JQJ)^{-} = J (JQJ)^{-} J$.

The form of {D.7} can possibly be further simplified depending on
the probability distribution assumed for the random variables $r_j$; that
is, if a multivariate normal distribution is assumed, A will be
nonsingular and $Cy = 0$ will reduce to a single linear constraint for $y_1 > 0$.

{D.7} has a concave objective function and linear constraints, and
offers computational advantages over {PS.1}. The feasible set of {D.7}
permits the straightforward application of Rosen's gradient projection
algorithm and it is only necessary to compute the generalized inverse
of $Q' = JQJ$ once.

5.6 Convex Programming

Topkis and Veinott [68] have stated conditions on the directions
and step sizes which assure convergence to a stationary point in feasible
direction algorithms for minimizing a real-valued continuous function on
a closed set. In particular, for constrained problems, they state second
order methods which require the solution of quadratically constrained
quadratic programs for the determination of feasible directions. See
Topkis and Veinott, Theorem 3 and Lemmas 4 and 5.
Given the mathematical program:

minimize  $\phi_0(x)$

subject to:  $\phi_j(x) \le 0$  for $j = 1,2,\ldots,m$

where the $\phi_j(x)$, $j = 0,1,2,\ldots,m$, are convex and twice differentiable.
The basic second order method algorithm is stated as follows for
the $(k+1)$st solution given the $k$th feasible solution $x^{k}$:

(1) Is $x^{k}$ optimal? If yes, stop; otherwise continue.

(2) Compute feasible direction $d^{k}$ by solution of the following
program,

minimize  $s$

subject to:  $[\nabla \phi_0(x^{k})]^{t} d + \tfrac{1}{2}\, d^{t} H(\phi_0(x^{k}))\, d - s \le 0$

$\phi_j(x^{k}) + [\nabla \phi_j(x^{k})]^{t} d + \tfrac{1}{2}\, d^{t} H(\phi_j(x^{k}))\, d - s \le 0$  for $j = 1,2,\ldots,m$

$d^{t} d \le 1$

using the generalized inverse dual, {D.1}.

(3) Solve the following for $\lambda^{k}$:

minimize over $\lambda \ge 0$:  $\phi_0(x^{k} + \lambda d^{k})$

subject to:  $\phi_j(x^{k} + \lambda d^{k}) \le 0$  for $j = 1,2,\ldots,m$

Put $x^{k+1} = x^{k} + \lambda^{k} d^{k}$ and return to Step (1).


Topkis  and  Veinott,  among  others,  point  out  that  such  an  algorithm 
could  prove  superior  to  available  first  order  methods.  To  see  this  and 
gain  an  appreciation  of  the  potential  of  the  algorithm  the  following 
example  is  offered. 
{E.3}  minimize  $x_2$

subject to:  $(x_1^{2} + x_2^{2})^{2} \le 1$

The result of two iterations for this problem using the proposed
second order method is seen in Figure 5-1 as the dashed line. The
second step estimates the minimum at $x_1 = -0.069$ and $x_2 = -0.998$ where the
actual values for the minimum are $x_1 = 0$ and $x_2 = -1.000$. The corresponding
results for the first-order method (in a first order method $d^{k}$ is
obtained in Step (2) as the solution of a linear program) appear in
Figure 5-1 as the solid line.

The generalized inverse dual, {D.1}, of Chapter 3 offers a
computationally tractable method of determining second order feasible
directions, the lack of which has been the major drawback to the algorithm.
Figure 5-1



CHAPTER  6 
FUTURE  RESEARCH 

In  the  foregoing,  a  viable  duality  theory  has  been  developed  for 
convex  quadratically  constrained  quadratic  programs.  The  results  of  this 
research  have  exposed  other  pertinent  areas  in  which  further  investigation 
should  prove  fruitful.  These  areas  can  be  broadly  categorized  as  theory, 
algorithm  design,  and  numerical  theory  associated  with  algorithm  imple- 
mentation. 

Attempts at generalizing the primal problem convex quadratic functions
to pseudoconvex and quasiconvex quadratic functions have proved to be
unsatisfactory. That is, applying the results of Martos [46] and Cottle
and Ferland [19] it can readily be shown that the Lagrangian function
does not exhibit any usable mathematical structure. This is not totally
unexpected because, in general, sums of pseudoconvex or quasiconvex
functions are not pseudoconvex or quasiconvex. The significance, though,
lies in the fact that while convex quadratic functions impart to the
Lagrangian their mathematical structure, a generalization of convexity
for quadratic functions negates this structural transfer.

The research into which, if any, subclasses of the generalized convex
quadratically constrained quadratic programs evidence a viable
generalized inverse dual should continue. The duality theorems of
Karamardian [38], for Lagrangian functions which are strictly quasiconvex
in the primal variables and strictly quasiconcave in the dual
variables, could prove to be a starting point for defining these
subclasses. As noted, investigations to date have been based on
structural properties of the primal problem and not on the dual form.
Hence, one must determine if a recognizable mathematical structure on the
dual form dictates structural properties for the primal problem.

Another area for continued research is the extension of the concept
of an index maximal dual feasible vector as introduced in Chapter 3. It
was seen how this concept led to some very interesting theoretical results
for the class {P.1}, and it is possible that a broader interpretation and
use can be established.

A prelude to the above could be the examination of minimal dimension
dual spaces. This idea is prompted by the demonstrated equivalence
between the generalized inverse dual and the conjugate function dual.
That is, the generalized inverse dual has but one dual variable associated
with each inequality constraint of the primal problem, whereas the conjugate
function dual associates n+1 dual variables with each inequality constraint,
so the generalized inverse dual space is smaller than that of the conjugate
function dual.

In  a  sense,  the  property  of  taking  an  n  variable,  m  constraint 
problem  and  developing  an  m  variable,  n  constraint  dual  has  been  extended 
to convex quadratically constrained quadratic programs. Can this extension
be  carried  further  to  other  subclasses  of  convex  programs,  the  dimension 
of  the  dual  space  being  determined  by  the  number  of  inequalities  of  the 
primal  problem?   If  so,  what  advantages  are  there? 

Another area of particular interest is the investigation of the
conjugate function in relation to the generalized inverse. Again this
research is prompted by the equivalence between the generalized inverse
dual and the conjugate function dual. What is the detailed relationship
between the two concepts for quadratic functions, and can this relationship
be extended to other classes of functions?

Finally,  the  question  of  second-order  feasible  direction  methods 
should  be  investigated  further  in  light  of  the  generalized  inverse  dual. 
The  research  of  this  topic  falls  into  all  three  categories:   theory, 
algorithm  design,  and  numerical  theory. 

With  respect  to  algorithm  design,  there  are  two  general  areas  for 
future  research,  the  first  being  design  of  an  algorithm  based  on  the 
generalized  inverse  dual  for  semidefinite  problems.  Two  possible 
approaches  to  such  an  algorithm  would  be  to  design  around  the  generalized 
inverse  dual  exclusively  or  to  use  the  generalized  inverse  dual  as  the 
primary  form  and  the  conjugate  function  dual  for  unidirectional  searches, 
because  of  its  simpler  objective  function. 

The  second  area  of  algorithm  design  is  based  on  the  results  reported 
in  Chapter  5.   Specialized  algorithms  for  the  general  Fermat  problem  and 
extensions  to  the  multifacility  problem  of  Cabot  and  Francis  could  be 
significant  improvements  over  currently  available  algorithms. 

A direct descendant of algorithm design is research into the numerical
analysis of algorithm implementation. It is first noted that the existing
implemented algorithm for definite problems does not take full advantage
of the state of the art. That is, an open question remains as to how
efficiently definite problems can be solved by use of the generalized
inverse dual.

For semidefinite problems many questions of implementation require
research. For example, would it be better to compute $Q^{-}(y)$ or solve
$Q(y)x = -Hy$? Also, with respect to an algorithm design combining the
generalized inverse and conjugate function dual, what is required in
terms of computation to transfer from one dual form to the other?

In conclusion, it can be seen from this partial listing that
several interesting and potentially beneficial areas of research have
been opened by the introduction of the generalized inverse dual.
APPENDIX A
THE MOORE-PENROSE GENERALIZED INVERSE

Ordinarily  matrices  are  thought  of  as  being  either  nonsingular  or 
singular  and  consequently  having  or  not  having  an  inverse.   In  1920 
Moore  [47]  published  a  paper  on  the  reciprocal  of  a  general  matrix,  but 
it  went  unnoticed  until  1955  when  Penrose  [50],  unaware  of  Moore's  work, 
published  a  paper  on  the  generalized  inverse  of  a  matrix.  Penrose's 
paper  created  the  impetus  for  further  research  and  several  authors  have 
now  added  to  the  theory  of  what  is  known  as  the  Moore-Penrose  generalized 
inverse. Because there is as yet no standard nomenclature, some authors
refer to it as the Moore-Penrose pseudo-inverse, pseudo-inverse, general
reciprocal, Moore-Penrose generalized inverse, or generalized inverse,
while others refer to various other matrix inverse forms by these and
other names. The recent work of Bouillon and Odell [8] has an extensive
bibliography on the subject. Herein, Moore-Penrose generalized inverse
and generalized inverse will be synonymous.

The  major  advantage  of  the  Moore-Penrose  generalized  inverse  is 
that  it  exists  and  is  unique  for  any  matrix.  This  permits  a  unified 
theoretical  treatment  of  matrix  calculus  and  greater  flexibility  in  the 
application  of  matrix  methods  to  other  fields. 

As  is  often  the  case  in  matrix  calculus,  initial  efforts  at 
developing  the  theory  of  the  generalized  inverse  were  carried  on  in  the 
field  of  statistics.   A  representative  bibliography  of  these  works  is 
found  in  Bouillon  and  Odell. 
The application of the generalized inverse theory to operations
research has been almost exclusively in the area of linear programming
and carried out by Cline [18], Pyle [55], and Charnes, Cooper and
Thompson [13]. Charnes and Kirby [14] have addressed the generalized
inverse applied to a convex programming problem.

This appendix is included to support the self-contained nature of
the dissertation and presents a compilation of the Moore-Penrose
generalized inverse theory which is germane to the extensions of
Chapter 2, the duality theory of Chapter 3, and the equivalence concepts
of Chapter 4. The theory herein presented is for real matrices, but the
extension to matrices over the complex field is immediate.

The  first  definition  is  due  to  Penrose  [50].  The  second  and 
alternate  definition,  due  to  Moore  [47],  is  used  for  theoretical 
development  of  derivative  forms . 

Definition. Let A be an $n \times m$ matrix. A matrix $A^{-}$ which has the
following properties is called a generalized inverse of A:
(i) $AA^{-}$ is symmetric
(ii) $A^{-}A$ is symmetric
(iii) $AA^{-}A = A$
(iv) $A^{-}AA^{-} = A^{-}$

Definition. For any $n \times m$ matrix A, the generalized inverse is
defined as

$$A^{-} = \lim_{\delta \to 0} (A^{t}A + \delta^{2} I)^{-1} A^{t} = \lim_{\delta \to 0} A^{t} (AA^{t} + \delta^{2} I)^{-1}.$$
A discussion and proof of the equivalence of these two definitions
is found in Albert [1].
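
The limit form of the definition suggests a simple numerical approximation. The following sketch evaluates $(A^{t}A + \delta^{2}I)^{-1}A^{t}$ for one small fixed $\delta$; it is an illustration under assumptions, not a recommended algorithm, since the regularized system becomes ill-conditioned as $\delta \to 0$. The helper names and the default `delta` are hypothetical choices.

```python
def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]

def solve(M, B):
    """Solve M X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    aug = [list(M[i]) + list(B[i]) for i in range(n)]
    for k in range(n):
        piv = max(range(k, n), key=lambda i: abs(aug[i][k]))
        aug[k], aug[piv] = aug[piv], aug[k]
        pivot = aug[k][k]
        aug[k] = [v / pivot for v in aug[k]]
        for i in range(n):
            if i != k:
                f = aug[i][k]
                aug[i] = [vi - f * vk for vi, vk in zip(aug[i], aug[k])]
    return [row[n:] for row in aug]

def ginv_limit(A, delta=1e-4):
    """Approximate A^- by (A^t A + delta^2 I)^{-1} A^t for a small delta."""
    At = transpose(A)
    G = matmul(At, A)
    for i in range(len(G)):
        G[i][i] += delta ** 2
    return solve(G, At)

A = [[1.0, 1.0], [1.0, 1.0]]   # singular, so no ordinary inverse exists
print(ginv_limit(A))            # entries close to 0.25
```

For this matrix the exact generalized inverse is $\tfrac{1}{4}A$, which can be confirmed against the four properties of the first definition.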

Before proving the existence theorem for generalized inverses,
three  lemmas  are  stated  and  proved.  These  lemmas  are  central  to  the 
existence  theorem  proof. 

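Lemma A.1. Let M be a matrix of order $n \times m$ of rank r; then there
are matrices P, of order $n \times r$, and Q, of order $r \times m$, both of rank r, such
that $M = PQ$.

Proof of Lemma A.1. Suppose that $n \ge m$; there exist nonsingular
matrices R and S such that

$$R^{-1} M S^{-1} = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix} \quad \text{or} \quad M = R \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix} S.$$

Put

$$P = R \begin{bmatrix} I_r \\ 0 \end{bmatrix} \quad \text{and} \quad Q = \begin{bmatrix} I_r & 0 \end{bmatrix} S,$$

then $M = PQ$ and P and Q both have rank r.

Q.E.D.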

Lemma A.2. Let M be an $m \times r$ matrix with rank r. Then $M^{t}M$ is
nonsingular.

Proof of Lemma A.2. Since M is of rank r, $Mx \ne 0$ for every $x \ne 0$, so

$$x^{t} M^{t} M x = (Mx)^{t}(Mx) > 0 \quad \text{for all } x \ne 0,$$

and $M^{t}M$ is symmetric positive definite.
Thus, there exists a nonsingular matrix P of size $r \times r$ such that

$$P^{t} M^{t} M P = I_r.$$

$M^{t}M$ is now equivalent to $I_r$ and hence nonsingular.

Q.E.D.
Lemma A.3. Let M be an $r \times m$ matrix of rank r. Then $MM^{t}$ is
nonsingular.

Proof of Lemma A.3. Define $\bar{M} = M^{t}$ and apply Lemma A.2, proving $\bar{M}^{t}\bar{M}$
is nonsingular; but $\bar{M}^{t}\bar{M} = MM^{t}$, so $MM^{t}$ is nonsingular.

Q.E.D.

Theorem A.1. The generalized inverse of an $n \times m$ matrix A, if it
exists, is of order $m \times n$.

Proof of Theorem A.1. By the properties of the generalized
inverse and conformability requirements it is seen that for $AA^{-}$ and $A^{-}A$
to be symmetric with A of order $n \times m$, $A^{-}$ must be of order $m \times n$.

Q.E.D.
The  existence  theorem  of  the  generalized  inverse  can  now  be 
stated. 

Theorem A.2. Let A be an $n \times m$ matrix. Then A has a generalized
inverse.

Proof of Theorem A.2. If A is a zero matrix of order $n \times m$ then $A^{-}$
is a zero matrix of order $m \times n$. Assume that A is not a zero matrix. By Lemma
A.1, A can be factored in the form

$$A = PQ$$

where P is an $n \times r$ and Q an $r \times m$ matrix, both of rank r. $P^{t}P$ and $QQ^{t}$ are
nonsingular by Lemmas A.2 and A.3.
Put $A^{-} = Q^{t}(QQ^{t})^{-1}(P^{t}P)^{-1}P^{t}$; then

(i) $(AA^{-})^{t} = (A^{-})^{t}A^{t}$
$= [Q^{t}(QQ^{t})^{-1}(P^{t}P)^{-1}P^{t}]^{t}(PQ)^{t}$
$= P\{(P^{t}P)^{-1}\}^{t}\{(QQ^{t})^{-1}\}^{t}QQ^{t}P^{t}$
$= P(P^{t}P)^{-1}(QQ^{t})^{-1}(QQ^{t})P^{t}$
$= P(P^{t}P)^{-1}P^{t}$
$= AA^{-}$.
Therefore, $AA^{-}$ is symmetric and satisfies (i) of the definition.

(ii) $(A^{-}A)^{t} = A^{t}(A^{-})^{t}$
$= (PQ)^{t}[Q^{t}(QQ^{t})^{-1}(P^{t}P)^{-1}P^{t}]^{t}$
$= Q^{t}P^{t}[P\{(P^{t}P)^{-1}\}^{t}\{(QQ^{t})^{-1}\}^{t}Q]$
$= Q^{t}(QQ^{t})^{-1}Q$
$= A^{-}A$.
$A^{-}A$ is symmetric and satisfies (ii) of the definition.

(iii) $AA^{-}A = PQ[Q^{t}(QQ^{t})^{-1}(P^{t}P)^{-1}P^{t}]PQ$
$= PQ$
$= A$,
satisfying (iii) of the definition.

(iv) $A^{-}AA^{-} = [Q^{t}(QQ^{t})^{-1}(P^{t}P)^{-1}P^{t}]PQ[Q^{t}(QQ^{t})^{-1}(P^{t}P)^{-1}P^{t}]$
$= Q^{t}(QQ^{t})^{-1}(P^{t}P)^{-1}P^{t}$
$= A^{-}$.
The defined $A^{-} = Q^{t}(QQ^{t})^{-1}(P^{t}P)^{-1}P^{t}$ is therefore a generalized inverse
of A and always exists.

Q.E.D.
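
The construction in the proof can be tried on a small case. The sketch below restricts itself to rank one, where $P$ is a column vector $p$, $Q$ is a row vector $q^{t}$, and both $P^{t}P$ and $QQ^{t}$ are scalars; this restriction, the function names, and the sample vectors are assumptions made to keep the example short.

```python
def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]

def close(X, Y, tol=1e-12):
    """Entrywise comparison of two matrices of equal shape."""
    return all(abs(a - b) < tol
               for rx, ry in zip(X, Y) for a, b in zip(rx, ry))

def ginv_rank_one(p, q):
    """For A = p q^t (p, q nonzero), return A and Q^t (QQ^t)^-1 (P^t P)^-1 P^t."""
    pp = sum(v * v for v in p)   # P^t P, a scalar when r = 1
    qq = sum(v * v for v in q)   # Q Q^t, a scalar when r = 1
    A = [[pi * qj for qj in q] for pi in p]
    G = [[qj * pi / (qq * pp) for pi in p] for qj in q]
    return A, G

def penrose_conditions(A, G):
    """Check properties (i)-(iv) of the definition for a candidate G."""
    AG, GA = matmul(A, G), matmul(G, A)
    return (close(AG, transpose(AG)), close(GA, transpose(GA)),
            close(matmul(AG, A), A), close(matmul(GA, G), G))

A, G = ginv_rank_one([1.0, 2.0], [1.0, 0.0, 1.0])
print(G)
print(penrose_conditions(A, G))
```

Here A is $2 \times 3$ and the candidate is $3 \times 2$, in agreement with Theorem A.1.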
In  the  following,  it  is  shown  that  the  generalized  inverse  as 
defined  is  unique . 

Theorem A.3. Let A be an $n \times m$ matrix. The generalized inverse $A^{-}$
of A is unique.

Proof of Theorem A.3. Suppose that A has two generalized inverses,
say $A_1^{-}$ and $A_2^{-}$. It will first be shown that $AA_1^{-} = AA_2^{-}$.

$AA_2^{-} = (AA_1^{-}A)A_2^{-}$
$= (AA_1^{-})(AA_2^{-})$
$= (AA_1^{-})^{t}(AA_2^{-})^{t}$
$= [(AA_2^{-}A)A_1^{-}]^{t}$
$= (AA_1^{-})^{t}$
$= AA_1^{-}$.

Similarly, $A_2^{-}A = A_1^{-}A$. Therefore,

$A_2^{-} = A_2^{-}AA_2^{-}$
$= (A_2^{-}A)A_1^{-}$
$= (A_1^{-}A)A_1^{-}$
$= A_1^{-}(AA_1^{-})$
$= A_1^{-}AA_1^{-}$
$= A_1^{-}$.

Hence, the generalized inverse is unique.

Q.E.D.

Corollary A.3.1. The generalized inverse of $A^{t}$ is the transpose
of the generalized inverse of A; i.e., $(A^{t})^{-} = (A^{-})^{t}$.

Proof of Corollary A.3.1. The proof consists of showing that
$(A^{-})^{t}$ is the generalized inverse of $A^{t}$, and since the generalized inverse
is unique by Theorem A.3, it follows that $(A^{t})^{-} = (A^{-})^{t}$. Write $A = PQ$
as in the proof of Theorem A.2,

$$A^{-} = Q^{t}(QQ^{t})^{-1}(P^{t}P)^{-1}P^{t}. \qquad (A.1)$$

Also,

$$A^{t} = Q^{t}P^{t}$$

and

$$(A^{t})^{-} = P(P^{t}P)^{-1}(QQ^{t})^{-1}Q.$$

Taking the transpose of (A.1),

$$(A^{-})^{t} = P(P^{t}P)^{-1}(QQ^{t})^{-1}Q.$$

Hence,

$$(A^{t})^{-} = (A^{-})^{t}$$

and by uniqueness of the generalized inverse of A the corollary is
proved.

Q.E.D.

A direct consequence of this corollary for symmetric matrices is
the following:

$$A^{-} = (A^{t})^{-} = (A^{-})^{t}.$$

That is, the generalized inverse of a symmetric matrix is symmetric.

Corollary A.3.2. The following hold for any $m \times n$ matrix A:

$$(AA^{t})^{-} = (A^{-})^{t}A^{-},$$
$$(A^{t}A)^{-} = A^{-}(A^{-})^{t}.$$

Proof of Corollary A.3.2. Clearly, conformability requirements
are satisfied. The corollary is then proved by demonstrating that the
four properties of the definition are satisfied.

Corollary A.3.3. The generalized inverse of an $m \times n$ matrix A can
be expressed as

$$A^{-} = A^{t}(AA^{t})^{-}$$

or

$$A^{-} = (A^{t}A)^{-}A^{t}.$$

Proof of Corollary A.3.3. Proof is by demonstrating that the
definition is satisfied. Note that conformability requirements are met.
For the first,

(i) $AA^{-} = AA^{t}(AA^{t})^{-}$ is symmetric, by noting that $(AA^{t})^{-}$ is by
definition the generalized inverse of $AA^{t}$.

(ii) $(A^{-}A)^{t} = (A^{t}(AA^{t})^{-}A)^{t}$
$= A^{t}\{(AA^{t})^{-}\}^{t}A$
$= A^{t}(AA^{t})^{-}A$
$= A^{-}A$.

(iii) $AA^{-}A = A(A^{t}(AA^{t})^{-})A$
$= AA^{t}(AA^{t})^{-}A$
$= AA^{t}(A^{-})^{t}A^{-}A$
$= A(A^{-}A)^{t}A^{-}A$
$= (AA^{-}A)A^{-}A$
$= AA^{-}A$
$= A$.

(iv) $A^{-}AA^{-} = A^{t}(AA^{t})^{-}AA^{t}(AA^{t})^{-}$
$= A^{t}(A^{-})^{t}A^{-}AA^{t}(A^{-})^{t}A^{-}$
$= (A^{-}A)^{t}A^{-}A(A^{-}A)^{t}A^{-}$
$= A^{-}AA^{-}A(A^{-}A)A^{-}$
$= A^{-}AA^{-}$
$= A^{-}$.

The proof of the second form follows in the same manner.

Q.E.D.
Corollary A.3.4. The generalized inverse of $A_0 = \begin{bmatrix} A \\ 0 \end{bmatrix}$ is

$$A_0^{-} = (A^{-}, 0).$$

Proof of Corollary A.3.4. The proof is by observing that the
four properties of the definition are satisfied.

Corollary A.3.5.

$$(\alpha A)^{-} = \frac{1}{\alpha} A^{-} \quad \text{for } \alpha \in E^{1},\ \alpha \ne 0.$$

Proof of Corollary A.3.5. Again, proof is by noting that the
definition is satisfied.

Theorem A.4. If A is nonsingular, $A^{-} = A^{-1}$.

Proof of Theorem A.4. The proof follows by noting that $A^{-1}$ satisfies
the definition for the generalized inverse of nonsingular A.

Theorem A.5. Let A be a matrix; then rank $(A^{-})$ = rank $(A)$.

Proof of Theorem A.5.

rank $(A)$ = rank $(AA^{-}A) \le$ rank $(A^{-})$
= rank $(A^{-}AA^{-}) \le$ rank $(A)$.

Q.E.D.

Theorem A.6. Let A be an $n \times m$ matrix. Then if

(a) rank $(A) = n$,  $A^{-} = A^{t}(AA^{t})^{-1}$

(b) rank $(A) = m$,  $A^{-} = (A^{t}A)^{-1}A^{t}$.

From (a) $AA^{-} = I$ and from (b) $A^{-}A = I$.

Proof of Theorem A.6. The proof of this theorem follows
directly from Corollary A.3.3 and Lemmas A.2 and A.3.

Theorem A.7. The system of equations $Ax = b$ is consistent if
and only if $AA^{-}b = b$; i.e., $AA^{-}$ is the projector onto $R(A)$, the range
space of A, and so b is in the column space of A.
Proof of Theorem A.7. Suppose the system is consistent; then
there exists an x such that $Ax = b$, and further

$(AA^{-})Ax = (AA^{-})b$
$Ax = AA^{-}b$
$b = AA^{-}b$.

Suppose now that $AA^{-}b = b$. Put $x = A^{-}b$; then $Ax = AA^{-}b = b$.
Hence, x satisfies $Ax = b$, which is then consistent.

Q.E.D.

Theorem A.8. Let the system of equations $Ax = b$ be consistent.
Then the vector

$$x = A^{-}b + (I - A^{-}A)g$$

is a solution to $Ax = b$ for every choice of g in the proper space.
Moreover, every solution to $Ax = b$ has this form.

Proof of Theorem A.8. For every choice of g,

$Ax = A\{A^{-}b\} + A(I - A^{-}A)g$
$= AA^{-}b + Ag - AA^{-}Ag$
$= AA^{-}b + Ag - Ag$
$= b$.

Hence, $x = A^{-}b + (I - A^{-}A)g$ is a solution for every g. Now let x be any
solution of $Ax = b$; then

$A^{-}Ax = A^{-}b$
$0 = A^{-}b - A^{-}Ax$,

and adding x to both sides of this equation,

$x = A^{-}b - A^{-}Ax + x$
$x = A^{-}b + (I - A^{-}A)x$

which is the required form with $g = x$.

Q.E.D.
Corollary A.8.1. If A is any $n \times m$ matrix, the following are true.

(a) The column spaces of A and $AA^{-}$ are the same.

(b) The column spaces of $A^{-}$ and $A^{-}A$ are the same.

(c) The column space of $(I - AA^{-})$ is the orthogonal complement
of the column space of A.

(d) The column space of $(I - A^{-}A)$ is the orthogonal complement
of the column space of $A^{t}$.

(e) The column space of $(I - A^{-}A)$ is the same as the null space
of A.

(f) The column space of $(I - AA^{-})$ is the same as the null space
of $A^{-}$.

Proof of Corollary A.8.1. Proof follows directly by application
of the required definitions.

Definition. An $n \times n$ matrix A is said to be idempotent if $AA = A$.

Corollary A.8.2. $(I - A^{-}A)$ is idempotent.

Proof of Corollary A.8.2.

$(I - A^{-}A)(I - A^{-}A) = (I - A^{-}A) - (I - A^{-}A)A^{-}A$
$= (I - A^{-}A) - A^{-}A + (A^{-}AA^{-})A$
$= (I - A^{-}A) - A^{-}A + A^{-}A$
$= (I - A^{-}A)$.

Q.E.D.

Theorem A.9. $(I - A^{-}A)$ is an $m \times m$ symmetric, positive semidefinite
matrix.

Proof of Theorem A.9.

$(I - A^{-}A)^{t} = I - (A^{-}A)^{t}$
$= I - A^{-}A$,

hence, $(I - A^{-}A)$ is symmetric and idempotent, and

$$x^{t}(I - A^{-}A)x = x^{t}(I - A^{-}A)^{t}(I - A^{-}A)x \ge 0.$$

Therefore, $(I - A^{-}A)$ is positive semidefinite.

Q.E.D.

Theorem A.10. If A is symmetric idempotent, then $A^{-} = A$;
i.e., if $A = A^{t}$ and $A = AA$, then $A^{-} = A$.

Proof of Theorem A.10. The proof is by showing that A satisfies
the definition. That is,

(i) $AA^{-} = AA = A$ is symmetric

(ii) $A^{-}A = AA = A$ is symmetric

(iii) $AA^{-}A = AAA = A$

(iv) $A^{-}AA^{-} = AAA = A$.

Q.E.D.

Theorem A.11. Let D be a diagonal matrix; then $D^{-}$ is a diagonal
matrix with diagonal elements the reciprocals of the nonzero diagonal
elements of D, or zero if the element is zero.

Example:

$$D = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \qquad D^{-} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1/2 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$

Proof of Theorem A.11. The proof follows from showing the
definition is satisfied.


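Definition. Let A be an $m_2 \times n_2$ matrix and let B be an $m_1 \times n_1$ matrix;
then the Kronecker product of A and B, which is written $A \times B$, is a matrix
C of order $m_1 m_2 \times n_1 n_2$ defined by

$$C = \begin{bmatrix} A b_{11} & A b_{12} & \cdots & A b_{1 n_1} \\ A b_{21} & A b_{22} & \cdots & A b_{2 n_1} \\ \vdots & \vdots & & \vdots \\ A b_{m_1 1} & A b_{m_1 2} & \cdots & A b_{m_1 n_1} \end{bmatrix}$$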
Theorem A.12. Let the matrix A be defined by the Kronecker
product

$$A = B \times C.$$

Then the generalized inverse of A is given by

$$A^{-} = B^{-} \times C^{-}.$$

Proof of Theorem A.12. From the theory dealing with the
Kronecker product,

$$(B \times C)(D \times E) = BD \times CE$$

provided conformability requirements are met, and it can be quickly
shown that $A^{-} = B^{-} \times C^{-}$ satisfies the definition.

Q.E.D.

By Theorem A.8, a solution of a linear system such as

$$Ax = b$$

can be found provided the system is consistent. Not all linear systems are consistent, however, and for an inconsistent system no $x$ can be found which satisfies it. It is possible, though, to find approximate solutions. The following develops one approach to defining an approximate solution to a linear system that is inconsistent. For further treatment of the topic see Price [54].

Definition. The vector $x^{o}$ is defined to be the best approximate solution to the system of equations ($A$ is an $m \times n$ matrix)

$$Ax - b = e(x)$$

if and only if

(a) for all $x \in E^{n}$, the relationship $(Ax-b)^{t}(Ax-b) \ge (Ax^{o}-b)^{t}(Ax^{o}-b)$ holds;

(b) and for those $x \ne x^{o}$ such that $(Ax-b)^{t}(Ax-b) = (Ax^{o}-b)^{t}(Ax^{o}-b)$, the relationship $x^{t}x > (x^{o})^{t}x^{o}$ holds.

Based on the above definition of what will be termed the best approximate solution, the following application of the generalized inverse results.

Theorem A.13. The best approximate solution to the system of equations $Ax = b$ is $x^{o}$, where

$$x^{o} = A^{-}b.$$
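Proof of Theorem A.13.

$$\begin{aligned}
(Ax-b)^{t}(Ax-b) &= (Ax - AA^{-}b + AA^{-}b - b)^{t}(Ax - AA^{-}b + AA^{-}b - b) \\
&= [A(x - A^{-}b) + (AA^{-} - I)b]^{t}[A(x - A^{-}b) + (AA^{-} - I)b] \\
&= [A(x - A^{-}b)]^{t}[A(x - A^{-}b)] + [(AA^{-} - I)b]^{t}[(AA^{-} - I)b] \\
&\ge [(AA^{-} - I)b]^{t}[(AA^{-} - I)b]
\end{aligned}$$

where the cross product terms are seen to be zero, since $AA^{-}$ is symmetric and hence $A^{t}(AA^{-} - I) = (AA^{-}A)^{t} - A^{t} = 0$. This inequality holds for all $x \in E^{n}$. Letting $x^{o} = A^{-}b$,

$$(Ax-b)^{t}(Ax-b) \ge (Ax^{o}-b)^{t}(Ax^{o}-b)$$

for all $x \in E^{n}$. Equality holds if and only if

$$[A(x - A^{-}b)]^{t}[A(x - A^{-}b)] = 0,$$

i.e., if and only if

$$Ax = AA^{-}b.$$

Now it must be shown that for all $x$ such that $Ax = AA^{-}b$, the relationship $x^{t}x \ge (x^{o})^{t}x^{o}$ holds. For all $x \in E^{n}$ such that $Ax = AA^{-}b$,

$$A^{-}b = A^{-}Ax \quad \text{and} \quad x = A^{-}b - A^{-}Ax + x,$$

hence,

$$\begin{aligned}
x^{t}x &= (A^{-}b + (I - A^{-}A)x)^{t}(A^{-}b + (I - A^{-}A)x) \\
&= (A^{-}b)^{t}(A^{-}b) + [(I - A^{-}A)x]^{t}[(I - A^{-}A)x], \\
x^{t}x &= (A^{-}b)^{t}(A^{-}b) + (x - A^{-}b)^{t}(x - A^{-}b)
\end{aligned}$$

and if $x \ne x^{o}$, $x^{o} = A^{-}b$,

$$x^{t}x > (x^{o})^{t}x^{o}.$$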


Q.E.D. 
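
A small inconsistent system may help to fix the two conditions of the definition. The sketch below is an illustration under stated assumptions, not part of the original development: the matrix, the right-hand side, the trial vectors, and the helper names (`matmul`, `transpose`, `matvec`, `sq_norm`, `sq_residual`) are all hypothetical choices, and the generalized inverse of this particular rank-one matrix is supplied by hand and then confirmed against the four defining conditions.

```python
from fractions import Fraction as F

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def sq_norm(v):
    return sum(x * x for x in v)

def sq_residual(A, x, b):
    """(Ax - b)^t (Ax - b)."""
    return sq_norm([p - q for p, q in zip(matvec(A, x), b)])

A = [[F(1), F(1)], [F(1), F(1)]]                  # singular, so Ax = b below is inconsistent
G = [[F(1, 4), F(1, 4)], [F(1, 4), F(1, 4)]]      # candidate A^-
b = [F(1), F(3)]

# Confirm that G satisfies the four defining conditions for A^-.
AG, GA = matmul(A, G), matmul(G, A)
assert AG == transpose(AG) and GA == transpose(GA)
assert matmul(AG, A) == A and matmul(GA, G) == G

x0 = matvec(G, b)                  # best approximate solution x^o = A^- b
best = sq_residual(A, x0, b)

for x in ([F(0), F(0)], [F(2), F(0)], [F(3), F(-1)], [F(1), F(2)]):
    # Condition (a): no trial vector gives a smaller squared residual.
    assert sq_residual(A, x, b) >= best
    # Condition (b): among vectors attaining the minimum, x^o has the smallest norm.
    if x != x0 and sq_residual(A, x, b) == best:
        assert sq_norm(x) > sq_norm(x0)
```

The trial vectors $(2,0)$ and $(3,-1)$ attain the same residual as $x^{o}$ but have larger norm, which is the situation addressed by condition (b).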
Corollary A.13.1. The best approximate solution always exists and is unique.

Proof of Corollary A.13.1. By Theorem A.13 the best approximate solution is $x^{o} = A^{-}b$, and by Theorems A.2 and A.3 $A^{-}$ always exists and is unique; hence so is $x^{o}$.

Q.E.D. 
The following theorems and corollary are due to Cline [16,17]. Complete proofs are available in Cline's papers and will not be included here. Theorem A.15 was first constructively proved by Greville [33]. The statements given here are for real matrices, but the theory holds for complex matrices using the conjugate transpose.

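Theorem A.14. For any matrices $U$ and $V$, the generalized inverse of the sum $UU^{t} + VV^{t}$ can be written in the form

$$[UU^{t} + VV^{t}]^{-} = (I - (C^{-})^{t}V^{t})\,(U^{-})^{t}\,[I - U^{-}V(I - C^{-}C)KV^{t}(U^{-})^{t}]\,U^{-}(I - VC^{-}) + (C^{-})^{t}C^{-},$$

where

$$C = (I - UU^{-})V$$

and

$$K = [I + (I - C^{-}C)V^{t}(U^{-})^{t}U^{-}V(I - C^{-}C)]^{-1}.$$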
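Corollary A.14.1.

$$[UU^{t} + VV^{t}]^{-} = (U^{-})^{t}U^{-} - (U^{-})^{t}U^{-}VKV^{t}(U^{-})^{t}U^{-}$$

if and only if

$$C = 0 \quad \text{with} \quad K = [I + V^{t}(U^{-})^{t}U^{-}V]^{-1}.$$

Theorem A.15. If $B = [U, V]$, then

$$B^{-} = \begin{bmatrix} U^{-}(I - VJ) \\ J \end{bmatrix},$$

where

$$J = C^{-} + (I - C^{-}C)KV^{t}(U^{-})^{t}U^{-}(I - VC^{-}),$$

$$C = (I - UU^{-})V,$$

and

$$K = \{I + [U^{-}V(I - C^{-}C)]^{t}[U^{-}V(I - C^{-}C)]\}^{-1}.$$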

One significant disadvantage in the use of the generalized inverse is the fact that $(AB)^{-} \ne B^{-}A^{-}$ in many cases. There are instances, though, when the form is applicable, as defined in the following theorem. The proof has not been included, but can be found in Bouillon and Odell [8] or Albert [1].

Theorem A.16. $(AB)^{-} = B^{-}A^{-}$ if and only if $A^{-}ABB^{t}A^{t} = BB^{t}A^{t}$ and $BB^{-}A^{t}AB = A^{t}AB$.


APPENDIX  B 
THE  TRACE  FUNCTION 

A well-known function of square matrices is the trace function, tr,

$$\mathrm{tr} : E^{n \times n} \to E^{1}.$$

In  order  to  facilitate  the  theoretical  developments  in  Chapter  2, 
the  following  known  theorems  and  proofs  are  presented. 

Definition. The trace of an $n \times n$ matrix $A$, which is designated by tr($A$), is defined to be the sum of the diagonal elements of $A$; that is

$$\mathrm{tr}(A) = \sum_{i=1}^{n} a_{ii}.$$

Theorem B.1. Let $A$ and $B$ be $n \times n$ matrices, then

$$\mathrm{tr}(AB) = \mathrm{tr}(BA).$$

Proof of Theorem B.1. Put $AB = C$, then

$$c_{pq} = \sum_{j=1}^{n} a_{pj}b_{jq}.$$

Now put $BA = G$, then

$$g_{rs} = \sum_{i=1}^{n} b_{ri}a_{is}.$$

By definition

$$\mathrm{tr}(AB) = \mathrm{tr}(C) = \sum_{p=1}^{n} c_{pp} = \sum_{p=1}^{n}\sum_{j=1}^{n} a_{pj}b_{jp}.$$

Also,

$$\mathrm{tr}(BA) = \mathrm{tr}(G) = \sum_{r=1}^{n} g_{rr} = \sum_{r=1}^{n}\sum_{i=1}^{n} b_{ri}a_{ir}.$$

Thus, it is seen that tr($AB$) = tr($BA$).

Q.E.D.

Another well-known theorem on the trace function is the following.
Theorem B.2. If $A_j$, $j = 1, 2, \ldots, m$, are $n \times n$ matrices and the $a_j$ are scalars, then

$$\mathrm{tr}\Big[\sum_{j=1}^{m} a_j A_j\Big] = \sum_{j=1}^{m} a_j\,\mathrm{tr}(A_j).$$
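
Theorems B.1 and B.2 are easily confirmed on a small case. The following is a minimal sketch; the helper names `matmul` and `trace`, the two matrices, and the scalars are arbitrary illustrative choices and do not come from the original text.

```python
def matmul(X, Y):
    """Ordinary matrix product of two list-of-rows matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def trace(M):
    """Sum of the diagonal elements of a square matrix."""
    return sum(M[i][i] for i in range(len(M)))

A = [[1, 2], [3, 4]]
B = [[0, 1], [5, -2]]

# Theorem B.1: tr(AB) = tr(BA), although AB and BA themselves differ.
assert matmul(A, B) != matmul(B, A)
assert trace(matmul(A, B)) == trace(matmul(B, A))

# Theorem B.2: the trace of a linear combination is the same combination of traces.
a1, a2 = 2, 3
combo = [[a1 * A[i][j] + a2 * B[i][j] for j in range(2)] for i in range(2)]
assert trace(combo) == a1 * trace(A) + a2 * trace(B)
```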

The  remaining  theorems  are  not  as  well  known  and  due  to  their 
importance  proofs  will  be  given. 

Theorem B.3. Let $A$ be an $m \times n$ matrix; then tr($A^{t}A$) = 0 if and only if $A = 0$, the zero matrix.

Proof of Theorem B.3. If $A = 0$, then $A^{t}A = 0$ and certainly tr($A^{t}A$) = 0. Assume now that tr($A^{t}A$) = 0. $A^{t}A$ is a positive semidefinite matrix, so its diagonal elements are nonnegative, and by tr($A^{t}A$) = 0 each diagonal element is zero. By definition of the $j$th diagonal element of $A^{t}A$,

$$\sum_{i=1}^{m} a_{ij}^{2} = 0, \quad \text{for } j = 1, 2, \ldots, n,$$

it is clear that $a_{ij} = 0$. Hence $A = 0$.

Q.E.D. 

Theorem  B.4.  Let  A  and  B  be  nxn  positive  semidefinite  matrices; 

then 

tr(AB)  =  0,  if  and  only  if  AB  =  0,  the  zero  matrix. 
Proof of Theorem B.4. Clearly, if $AB = 0$, then tr($AB$) = 0. To prove the "only if," note that since $A$ and $B$ are positive semidefinite, by Lemma 2.3, there exist matrices $U$ and $V$ such that $A = U^{t}U$ and $B = VV^{t}$, so that

$$\mathrm{tr}(AB) = \mathrm{tr}(U^{t}UVV^{t}) = \mathrm{tr}(V^{t}U^{t}UV).$$

The last equality follows from Theorem B.1. By the hypothesis, tr($AB$) = 0, and since tr($AB$) = tr[$(UV)^{t}(UV)$], it follows from Theorem B.3 that

$$UV = 0.$$

Premultiplying by $U^{t}$ and postmultiplying by $V^{t}$ results in

$$U^{t}UVV^{t} = AB = 0.$$

Q.E.D. 

Corollary B.4.1. Let $A$ and $B$ be $n \times n$ positive semidefinite matrices; then

$$\mathrm{tr}(AB) \ge 0.$$

Proof of Corollary B.4.1. As in the proof of Theorem B.4,

$$\mathrm{tr}(AB) = \mathrm{tr}[(UV)^{t}(UV)].$$

Let $UV = C$, then

$$\mathrm{tr}(AB) = \mathrm{tr}(C^{t}C) = \sum_{j=1}^{n}\sum_{i=1}^{n} c_{ij}^{2} \ge 0.$$

Q.E.D. 
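
The factorization argument used for Theorem B.4 and Corollary B.4.1 can be traced through on small matrices. The sketch below is illustrative only: the factors `U`, `V`, `U0`, `V0` and the helper names are hypothetical choices, with $A = U^{t}U$ and $B = VV^{t}$ constructed directly so that both are positive semidefinite.

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

# Corollary B.4.1: tr(AB) equals the sum of squared elements of C = UV, hence is nonnegative.
U = [[1, 0], [1, 1]]
V = [[1, 2], [0, 1]]
A = matmul(transpose(U), U)        # A = U^t U
B = matmul(V, transpose(V))        # B = V V^t
C = matmul(U, V)
t = trace(matmul(A, B))
assert t == sum(c * c for row in C for c in row)
assert t >= 0

# Theorem B.4: when tr(AB) = 0 the product AB is itself the zero matrix.
U0 = [[1, 0], [0, 0]]
V0 = [[0, 0], [0, 1]]
A0 = matmul(transpose(U0), U0)
B0 = matmul(V0, transpose(V0))
assert trace(matmul(A0, B0)) == 0
assert matmul(A0, B0) == [[0, 0], [0, 0]]
```

In the second case neither $A$ nor $B$ is zero, yet the product vanishes, which is the situation the theorem characterizes.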